Summary of the Invention

In one aspect the invention provides a single chain multi-functional biosynthetic protein expressed from
a single gene derived by recombinant DNA techniques, said protein comprising:

a biosynthetic antibody binding site capable of binding to a preselected antigenic determinant and
comprising an amino acid sequence homologous with the sequence of a variable region of an
immunoglobulin molecule capable of binding said preselected antigenic determinant,

a first biofunctional domain comprising a polypeptide selected from the group consisting of effector
proteins having a conformation suitable for biological activity in mammals, amino acid sequences capable of
sequestering an ion, and amino acid sequences capable of selective binding to a solid support, and

a first polypeptide linker disposed between said binding site and said first biofunctional domain, wherein 
said polypeptide linker comprises plural, hydrophilic, peptide-bonded amino acids and which defines a
polypeptide which connects the C-terminal end of said binding site and the N-terminal end of said first 
biofunctional domain or the N-terminal end of said binding site and the C-terminal end of said first 
biofunctional domain, whereupon said binding protein assumes a conformation suitable for binding and said 
first biofunctional domain assumes a conformation suitable for biological activity, sequestering an ion, or
selectively binding a solid support. 

The binding site may comprise at least two binding domains peptide bonded by a second polypeptide 
linker disposed between said domains, wherein said second polypeptide linker comprises plural,
hydrophilic peptide-bonded amino acids and which defines a polypeptide of a length sufficient to span the
distance between the C-terminal end of one of said binding domains and the N-terminal end of the other of 
said binding domains when said binding protein assumes a conformation suitable for binding when 
disposed in aqueous solution. 

The amino acid sequence of each of the binding domains may comprise a set of CDRs (complementarity
determining regions) interposed between a set of FRs (framework regions), each of which is
respectively homologous with CDRs and FRs from a said variable region of an immunoglobulin molecule 
capable of binding said preselected antigenic determinant. 

At least one of said binding domains may comprise a said set of CDRs homologous with the CDRs in a
first immunoglobulin and a set of FRs homologous with the FRs in a second, distinct immunoglobulin.

In another aspect, the invention provides a single polypeptide chain comprising:

a pair of polypeptide domains together defining a site for binding a preselected antigen, and being 
joined through the C-terminus of one to the N-terminus of the other by a polypeptide linker, wherein the 
amino acid sequence of each of said polypeptide domains mimics an immunoglobulin variable region, and
at least one said domain comprises: 

a set of CDR amino acid sequences together defining a recognition site for said preselected antigen, 
wherein said CDR sequences are non-human sequences,

a set of FR amino acid sequences linked to said set of CDR sequences, wherein said FR amino acid
sequences are homologous to sequences obtained from a human immunoglobulin, and 

said linked sets of CDR and FR amino acid sequences together defining a chimeric binding domain 
which, when disposed in aqueous solution, assumes a tertiary structure suitable for immunological binding 
with said preselected antigen. 

In still another aspect, the invention provides a single polypeptide chain comprising:
a pair of polypeptide domains defining a site for binding a preselected antigen joined by a polypeptide 
linker spanning the distance between the C-terminal of one to the N-terminal of the other, wherein the 
amino acid sequence of at least one of said polypeptide domains comprises a recombinant variable region
comprising: 

a set of CDR amino acid sequences together defining a recognition site for said preselected antigen,
wherein said CDR sequences are homologous to sequences obtained from a first immunoglobulin,

a set of FR amino acid sequences linked to said set of CDR sequences, wherein said FR amino acid
sequences are homologous to sequences obtained from a second immunoglobulin, and 

said linked sets of CDR and FR amino acid sequences together defining a chimeric single chain variable
region binding polypeptide which, when disposed in aqueous solution, assumes a tertiary structure suitable 
for immunological binding with said preselected antigen. 

In preferred aspects, the FRs of the binding protein are homologous to at least a portion of the FRs
from a human immunoglobulin; the linker spans at least about 40 angstroms; a polypeptide spacer is
incorporated in the multifunctional protein between the binding site and the second polypeptide; and the
binding protein has an affinity for the preselected antigenic determinant no less than two orders of
magnitude less than the binding affinity of the immunoglobulin molecule used as a template for the CDR
regions of the binding protein. The preferred linkers and spacers are cysteine-free. The linker preferably
comprises amino acids having unreactive side groups, e.g., alanine and glycine. Linkers and spacers can be
made by combining plural consecutive copies of an amino acid sequence, e.g., (Gly4-Ser)3. The invention
also provides DNAs encoding these proteins and host cells harboring and capable of expressing these
DNAs.
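
The repeat-unit construction of linkers and spacers described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `repeat_linker` and `is_cysteine_free` are hypothetical, and the one-letter string GGGGS is simply the usual one-letter spelling of the Gly4-Ser unit named in the text.

```python
# Illustrative sketch only; the helper names are hypothetical, not from the source.

def repeat_linker(unit: str, copies: int) -> str:
    """Build a linker from consecutive copies of a one-letter-code unit."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return unit * copies


def is_cysteine_free(sequence: str) -> bool:
    """Return True when the sequence contains no cysteine (C) residues."""
    return "C" not in sequence.upper()


# (Gly4-Ser)3 in one-letter code: the unit GGGGS repeated three times.
GLY4_SER_3 = repeat_linker("GGGGS", 3)
```

The resulting 15-residue string contains only glycine and serine, so it satisfies the stated preference for cysteine-free linkers.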

As used herein, the phrase biosynthetic antibody binding site or BABS means synthetic proteins
expressed from DNA derived by recombinant techniques. BABS comprise biosynthetically produced
sequences of amino acids defining polypeptides designed to bind with a preselected antigenic material. The
structure of these synthetic polypeptides is unlike that of naturally occurring antibodies, fragments thereof,
e.g., Fv, or known synthetic polypeptides or "chimeric antibodies" in that the regions of the BABS
responsible for specificity and affinity of binding (analogous to native antibody variable regions) are linked
by peptide bonds, expressed from a single DNA, and may themselves be chimeric, e.g., may comprise
amino acid sequences homologous to portions of at least two different antibody molecules. The BABS
embodying the invention are biosynthetic in the sense that they are synthesized in a cellular host made to
express a synthetic DNA, that is, a recombinant DNA made by ligation of plural, chemically synthesized
oligonucleotides, or by ligation of fragments of DNA derived from the genome of a hybridoma, mature B cell
clone, or a cDNA library derived from such natural sources. The proteins of the invention are properly
characterized as "binding sites" in that these synthetic molecules are designed to have specific affinity for a
preselected antigenic determinant. The polypeptides of the invention comprise structures patterned after
regions of native antibodies known to be responsible for antigen recognition.

Accordingly, it is an object of the invention to provide novel multifunctional proteins comprising one or 
more effector proteins and one or more biosynthetic antibody binding sites, and to provide DNA sequences 
which encode the proteins. Another object is to provide a generalized method for producing biosynthetic 
antibody binding site polypeptides of any desired specificity. 


Brief Description of the Drawing 

The foregoing and other objects of this invention, the various features thereof, as well as the invention 
itself, may be more fully understood from the following description, when read together with the
accompanying drawings.

Figure 1A is a schematic representation of an intact IgG antibody molecule containing two light chains, 
each consisting of one variable and one constant domain, and two heavy chains, each consisting of one 
variable and three constant domains. Figure 1B is a schematic drawing of the structure of Fv proteins (and
DNA encoding them) illustrating Vh and Vl domains, each of which comprises four framework (FR) regions
and three complementarity determining (CDR) regions. Boundaries of CDRs are indicated, by way of
example, for monoclonal 26-10, a well known and characterized murine monoclonal specific for digoxin.

Figures 2A-2E are schematic representations of some of the classes of reagents constructed in
accordance with the invention, each of which comprises a biosynthetic antibody binding site. 

Figure 3 discloses five amino acid sequences (heavy chains) in single letter code lined up vertically to
facilitate understanding of the invention. Sequence 1 is the known native sequence of Vh from murine
monoclonal glp-4 (anti-lysozyme). Sequence 2 is the known native sequence of Vh from murine monoclonal
26-10 (anti-digoxin). Sequence 3 is a BABS comprising the FRs from 26-10 Vh and the CDRs from glp-4
Vh. The CDRs are identified in lower case letters; restriction sites in the DNA used to produce chimeric
sequence 3 are also identified. Sequence 4 is the known native sequence of Vh from human myeloma
antibody NEWM. Sequence 5 is a BABS comprising the FRs from NEWM Vh and the CDRs from glp-4 Vh,
i.e., illustrates a "humanized" binding site having a human framework but an affinity for lysozyme similar to
murine glp-4.

Figures 4A-4F are the synthetic nucleic acid sequences and encoded amino acid sequences of (4A) the
heavy chain variable domain of murine anti-digoxin monoclonal 26-10; (4B) the light chain variable domain
of murine anti-digoxin monoclonal 26-10; (4C) a heavy chain variable domain of a BABS comprising CDRs
of glp-4 and FRs of 26-10; (4D) a light chain variable region of the same BABS; (4E) a heavy chain variable
region of a BABS comprising CDRs of glp-4 and FRs of NEWM; and (4F) a light chain variable region
comprising CDRs of glp-4 and FRs of NEWM. Delineated are FRs, CDRs, and restriction sites for
endonuclease digestion, most of which were introduced during design of the DNA.

Figure 5 is the nucleic acid and encoded amino acid sequence of a host DNA (Vh) designed to facilitate
insertion of CDRs of choice. The DNA was designed to have unique 6-base sites directly flanking the CDRs
so that relatively small oligonucleotides defining portions of CDRs can be readily inserted, and to have other
sites to facilitate manipulation of the DNA to optimize binding properties in a given construct. The
framework regions of the molecule correspond to murine FRs (Figure 4A).

Figures 6A and 6B are multifunctional proteins (and DNA encoding them) comprising a single chain
BABS with the specificity of murine monoclonal 26-10, linked through a spacer to the FB fragment of
protein A, here fused as a leader, and constituting a binding site for Fc. The spacer comprises the 11
C-terminal amino acids of the FB followed by Asp-Pro (a dilute acid cleavage site). The single chain BABS
comprises sequences mimicking the Vh and Vl (6A) and the Vl and Vh (6B) of murine monoclonal 26-10.
The Vl in construct 6A is altered at residue 4 where valine replaces methionine present in the parent 26-10
sequence. These constructs contain binding sites for both Fc and digoxin. Their structure may be
summarized as:

(6A) FB-Asp-Pro-VH-(Gly4-Ser)3-VL,

and

(6B) FB-Asp-Pro-VL-(Gly4-Ser)3-VH,

where (Gly4-Ser)3 is a polypeptide linker.

In Figures 4A-4E and 6A and 6B, the amino acid sequence of the expression products starts after the
GAATTC sequence, which codes for an EcoRI splice site, translated as Glu-Phe on the drawings.
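
The reading of GAATTC as Glu-Phe follows directly from the standard genetic code, where GAA encodes glutamic acid and TTC encodes phenylalanine. A minimal sketch, assuming a hypothetical `translate` helper and a codon table deliberately limited to the two codons involved:

```python
# Only the two codons needed here are tabulated (standard genetic code).
# The function name and table are illustrative, not taken from the source.
CODON_TO_RESIDUE = {"GAA": "Glu", "TTC": "Phe"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into hyphen-joined three-letter residues."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    # A codon outside the small table raises KeyError by design.
    return "-".join(CODON_TO_RESIDUE[codon] for codon in codons)


ECORI_SITE = "GAATTC"
ECORI_TRANSLATION = translate(ECORI_SITE)
```

Because the EcoRI site is read in frame as two codons, joining fusion partners at that site leaves a Glu-Phe dipeptide at the junction.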

Figure 7A is a graph of percent of maximum counts bound of radioiodinated digoxin versus concentration
of binding protein adsorbed to the plate comparing the binding of native 26-10 (curve 1) and the
construct of Figure 6A and Figure 2B renatured using two different procedures (curves 2 and 3). Figure 7B
is a graph demonstrating the bifunctionality of the FB-(26-10) BABS adhered to microtiter plates through the
specific binding of the binding site to the digoxin-BSA coat on the plate. Figure 7B shows the percent
inhibition of 125I-rabbit-IgG binding to the FB domain of the FB BABS by the addition of IgG, protein A, FB,
murine IgG2a, and murine IgG1.

Figure 8 is a schematic representation of a model assembled DNA sequence encoding a multifunctional 
biosynthetic protein comprising a leader peptide (used to aid expression and thereafter cleaved), a binding
site, a spacer, and an effector molecule attached as a trailer sequence.

Figures 9A-9E are exemplary synthetic nucleic acid sequences and corresponding encoded amino acid
sequences of binding sites of different specificities: (A) FRs from NEWM and CDRs from 26-10 having the
digoxin specificity of murine monoclonal 26-10; (B) FRs from 26-10, and CDRs from G-loop-4 (glp-4) having
lysozyme specificity; (C) FRs and CDRs from MOPC-315 having dinitrophenol (DNP) specificity; (D) FRs
and CDRs from an anti-CEA monoclonal antibody; (E) FRs in both Vh and Vl and CDR1 and CDR3 in Vh,
and CDR1, CDR2, and CDR3 in Vl from an anti-CEA monoclonal antibody; CDR2 in Vh is a CDR2
consensus sequence found in most immunoglobulin Vh regions.

Figure 10A is a schematic representation of the DNA and amino acid sequence of a leader peptide
(MLE) protein with corresponding DNA sequence and some major restriction sites. Figure 10B shows the
design of an expression plasmid used to express MLE-BABS (26-10). During construction of the gene,
fusion partners were joined at the EcoRI site that is shown as part of the leader sequence. The pBR322
plasmid, opened at the unique SspI and PstI sites, was combined in a 3-part ligation with an SspI to EcoRI
fragment bearing the trp promoter and MLE leader and with an EcoRI to PstI fragment carrying the BABS
gene. The resulting expression vector confers tetracycline resistance on positive transformants.

Figure 11 is an SDS-polyacrylamide gel (15%) of the (26-10) BABS at progressive stages of
purification. Lane 0 shows low molecular weight standards; lane 1 is the MLE-BABS fusion protein; lane 2 is
an acid digest of this material; lane 3 is the pooled DE-52 chromatographed protein; lanes 4 and 5 are the
same ouabain-Sepharose® pool of single chain BABS except that lane 4 protein is reduced and lane 5
protein is unreduced.

Figure 12 shows inhibition curves for 26-10 BABS and 26-10 Fab species, and indicates the relative
affinities of the antibody fragment for the indicated cardiac glycosides. 

Figures 13A and 13B are plots of digoxin binding curves. (A) shows 26-10 BABS binding isotherm and 
Sips plot (inset), and (B) shows 26-10 Fab binding isotherm and Sips plot (inset).

Figure 14 is a nucleic acid sequence and corresponding amino acid sequence of a modified FB dimer 
leader sequence and various restriction sites.

Figures 15A-15H are nucleic acid sequences and corresponding amino acid sequences of biosynthetic
multifunctional proteins including a single chain BABS and various biologically active protein trailers linked
via a spacer sequence. Also indicated are various endonuclease digestion sites. The trailing sequences are
(A) epidermal growth factor (EGF); (B) streptavidin; (C) tumor necrosis factor (TNF); (D) calmodulin; (E)
platelet derived growth factor-beta (PDGF-beta); (F) ricin; (G) interleukin-2; and (H) an FB-FB dimer.

Description

The invention will first be described in its broadest overall aspects with a more detailed description
following. 

A class of novel biosynthetic, bi- or multifunctional proteins has now been designed and engineered
which comprise biosynthetic antibody binding sites, that is, "BABS" or biosynthetic polypeptides defining
structure capable of selective antigen recognition and preferential antigen binding, and one or more
peptide-bonded additional protein or polypeptide regions designed to have a preselected property. Examples of the
second region include amino acid sequences designed to sequester ions, which makes the protein suitable
for use as an imaging agent, and sequences designed to facilitate immobilization of the protein for use in
affinity chromatography and solid phase immunoassay. Another example of the second region is a bioactive
effector molecule, that is, a protein having a conformation suitable for biological activity, such as an
enzyme, toxin, receptor, binding site, growth factor, cell differentiation factor, lymphokine, cytokine,
hormone, or anti-metabolite. This invention features synthetic, multifunctional proteins comprising these
regions peptide bonded to one or more biosynthetic antibody binding sites, synthetic, single chain proteins
designed to bind preselected antigenic determinants with high affinity and specificity, constructs containing
multiple binding sites linked together to provide multipoint antigen binding and high net affinity and
specificity, DNA encoding these proteins prepared by recombinant techniques, host cells harboring these
DNAs, and methods for the production of these proteins and DNAs.

The invention requires recombinant production of single chain binding sites having affinity and
specificity for a predetermined antigenic determinant. This technology has been developed and is disclosed
herein. In view of this disclosure, persons skilled in recombinant DNA technology, protein design, and 
protein chemistry can produce such sites which, when disposed in solution, have high binding constants (at 
least 10^6, preferably 10^8 M^-1) and excellent specificity.

The design of the BABS is based on the observation that three subregions of the variable domain of
each of the heavy and light chains of native immunoglobulin molecules collectively are responsible for
antigen recognition and binding. Each of these subregions, called herein "complementarity determining
regions" or CDRs, consists of one of the hypervariable regions or loops and of selected amino acids or
amino acid sequences disposed in the framework regions or FRs which flank that particular hypervariable
region. It has now been discovered that FRs from diverse species are effective to maintain CDRs from
diverse other species in proper conformation so as to achieve true immunochemical binding properties in a
biosynthetic protein. It has also been discovered that biosynthetic domains mimicking the structure of the
two chains of an immunoglobulin binding site may be connected by a polypeptide linker while closely
approaching, retaining, and often improving their collective binding properties.

The binding site region of the multifunctional proteins comprises at least one, and preferably two
domains, each of which has an amino acid sequence homologous to portions of the CDRs of the variable
domain of an immunoglobulin light or heavy chain, and other sequence homologous to the FRs of the
variable domain of the same, or a second, different immunoglobulin light or heavy chain. The two domain
binding site construct also includes a polypeptide linking the domains. Polypeptides so constructed bind a
specific preselected antigen determined by the CDRs held in proper conformation by the FRs and the
linker. Preferred structures have human FRs, i.e., mimic the amino acid sequence of at least a portion of the
framework regions of a human immunoglobulin, and have linked domains which together comprise structure
mimicking a Vh-Vl or Vl-Vh immunoglobulin two-chain binding site. CDR regions of a mammalian
immunoglobulin, such as those of mouse, rat, or human origin are preferred. In one preferred embodiment,
the biosynthetic antibody binding site comprises FRs homologous with a portion of the FRs of a human
immunoglobulin and CDRs homologous with CDRs from a mouse or rat immunoglobulin. This type of
chimeric polypeptide displays the antigen binding specificity of the mouse or rat immunoglobulin, while its
human framework minimizes human immune reactions. In addition, the chimeric polypeptide may comprise
other amino acid sequences. It may comprise, for example, a sequence homologous to a portion of the
constant domain of an immunoglobulin, but preferably is free of constant regions (other than FRs).

The binding site region(s) of the chimeric proteins are thus single chain composite polypeptides
comprising a structure which in solution behaves like an antibody binding site. The two domain, single chain
composite polypeptide has a structure patterned after tandem Vh and Vl domains, but with the carboxyl
terminal of one attached through a linking amino acid sequence to the amino terminal of the other. The
linking amino acid sequence may or may not itself be antigenic or biologically active. It preferably spans a
distance of at least about 40 Å, i.e., comprises at least about 14 amino acids, and comprises residues which
together present a hydrophilic, relatively unstructured region. Linking amino acid sequences having little or
no secondary structure work well. Optionally, one or a pair of unique amino acids or amino acid sequences
recognizable by a site specific cleavage agent may be included in the linker. This permits the Vh and Vl-like
domains to be separated after expression, or the linker to be excised after refolding of the binding site.

EP 0 318 554 B1
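
The relationship between linker span and residue count stated above (about 40 Å for about 14 amino acids) can be checked with simple arithmetic. The sketch below is an illustration only: the function names are hypothetical, and the 3.8 Å figure is an outside assumption (the approximate distance between consecutive alpha carbons in a fully extended chain), not a value given in the text.

```python
import math


def min_residues(distance_angstrom: float, span_per_residue: float) -> int:
    """Smallest residue count whose combined span covers the distance."""
    if distance_angstrom <= 0 or span_per_residue <= 0:
        raise ValueError("distance and span per residue must be positive")
    return math.ceil(distance_angstrom / span_per_residue)


def implied_span_per_residue(distance_angstrom: float, residues: int) -> float:
    """Average span per residue implied by a distance and a residue count."""
    if residues < 1:
        raise ValueError("residues must be at least 1")
    return distance_angstrom / residues


# Figures from the text: about 40 angstroms, about 14 residues.
IMPLIED_SPAN = implied_span_per_residue(40.0, 14)

# Assumption (not from the text): about 3.8 angstroms per residue when fully extended.
FULLY_EXTENDED_MINIMUM = min_residues(40.0, 3.8)
```

Under these assumptions the text's figures imply roughly 2.9 Å per residue, which leaves slack relative to a fully extended chain and is consistent with a flexible, relatively unstructured linker rather than a taut one.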

Either the amino or carboxyl terminal ends (or both ends) of these chimeric, single chain binding sites
are attached to an amino acid sequence which itself is bioactive or has some other function to produce a
bifunctional or multifunctional protein. For example, the synthetic binding site may include a leader and/or
trailer sequence defining a polypeptide having enzymatic activity, independent affinity for an antigen
different from the antigen to which the binding site is directed, or having other functions such as to provide
a convenient site of attachment for a radioactive ion, or to provide a residue designed to link chemically to a
solid support. This fused, independently functional section of protein should be distinguished from fused
leaders used simply to enhance expression in prokaryotic host cells or yeasts. The multifunctional proteins
also should be distinguished from the "conjugates" disclosed in the prior art comprising antibodies which,
after expression, are linked chemically to a second moiety.

Often, a series of amino acids designed as a "spacer" is interposed between the active regions of the
multifunctional protein. Use of such a spacer can promote independent refolding of the regions of the
protein. The spacer also may include a specific sequence of amino acids recognized by an endopeptidase,
for example, endogenous to a target cell (e.g., one having a surface protein recognized by the binding site)
so that the bioactive effector protein is cleaved and released at the target. The second functional protein
preferably is present as a trailer sequence, as trailers exhibit less of a tendency to interfere with the binding
behavior of the BABS.

The therapeutic use of such "self-targeted" bioactive proteins offers a number of advantages over
conjugates of immunoglobulin fragments or complete antibody molecules: they are stable, less
immunogenic and have a lower molecular weight; they can penetrate body tissues more rapidly for purposes
of imaging or drug delivery because of their smaller size; and they can facilitate accelerated clearance of
targeted isotopes or drugs. Furthermore, because design of such structures at the DNA level as disclosed
herein permits ready selection of bioproperties and specificities, an essentially limitless combination of
binding sites and bioactive proteins is possible, each of which can be refined as disclosed herein to
optimize independent activity at each region of the synthetic protein. The synthetic proteins can be
expressed in procaryotes such as E. coli, and thus are less costly to produce than immunoglobulins or
fragments thereof which require expression in cultured animal cell lines.

The invention thus provides a family of recombinant proteins expressed from a single piece of DNA, all
of which have the capacity to bind specifically with a predetermined antigenic determinant. The preferred
species of the proteins comprise a second domain which functions independently of the binding region. In
this aspect the invention provides an array of "self-targeted" proteins which have a bioactive function and
which deliver that function to a locus determined by the binding site's specificity. It also provides
biosynthetic binding proteins having attached polypeptides suitable for attachment to immobilization
matrices which may be used in affinity chromatography and solid phase immunoassay applications, or
suitable for attachment to ions, e.g., radioactive ions, which may be used for in vivo imaging.

The successful design and manufacture of the proteins of the invention depend on the ability to
produce biosynthetic binding sites, and most preferably, sites comprising two domains mimicking the
variable domains of immunoglobulin connected by a linker.

As is now well known, Fv, the minimum antibody fragment which contains a complete antigen
recognition and binding site, consists of a dimer of one heavy and one light chain variable domain in
noncovalent association (Figure 1A). It is in this configuration that the three complementarity determining
regions of each variable domain interact to define an antigen binding site on the surface of the Vh-Vl dimer.
Collectively, the six complementarity determining regions (see Figure 1B) confer antigen binding specificity
to the antibody. FRs flanking the CDRs have a tertiary structure which is essentially conserved in native
immunoglobulins of species as diverse as human and mouse. These FRs serve to hold the CDRs in their
appropriate orientation. The constant domains are not required for binding function, but may aid in
stabilizing Vh-Vl interaction. Even a single variable domain (or half of an Fv comprising only three CDRs
specific for an antigen) has the ability to recognize and bind antigen, although at a lower affinity than an
entire binding site (Painter et al. (1972) Biochem. 11:1327-1337).

This knowledge of the structure of immunoglobulin proteins has now been exploited to develop 
multifunctional fusion proteins comprising biosynthetic antibody binding sites and one or more other 
domains. 

The structure of these biosynthetic proteins in the region which imparts the binding properties to the
protein is analogous to the Fv region of a natural antibody. It comprises at least one, and preferably two
domains consisting of amino acids defining Vh and Vl-like polypeptide segments connected by a linker
which together form the tertiary molecular structure responsible for affinity and specificity. Each domain
comprises a set of amino acid sequences analogous to immunoglobulin CDRs held in appropriate
conformation by a set of sequences analogous to the framework regions (FRs) of an Fv fragment of a
natural antibody.

The term CDR, as used herein, refers to amino acid sequences which together define the binding
affinity and specificity of the natural Fv region of a native immunoglobulin binding site, or a synthetic
polypeptide which mimics this function. CDRs typically are not wholly homologous to hypervariable regions
of natural Fvs, but rather also may include specific amino acids or amino acid sequences which flank the
hypervariable region and have heretofore been considered framework not directly determinative of
complementarity. The term FR, as used herein, refers to amino acid sequences flanking or interposed between
CDRs.

The CDR and FR polypeptide segments are designed based on sequence analysis of the Fv region of
preexisting antibodies or of the DNA encoding them. In one embodiment, the amino acid sequences
constituting the FR regions of the BABS are analogous to the FR sequences of a first preexisting antibody,
for example, a human IgG. The amino acid sequences constituting the CDR regions are analogous to the
sequences from a second, different preexisting antibody, for example, the CDRs of a murine IgG.
Alternatively, the CDRs and FRs from a single preexisting antibody from, e.g., an unstable or hard to culture
hybridoma, may be copied in their entirety.
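
The chimeric design described above, with FRs patterned on one antibody and CDRs on another, amounts to interleaving four framework segments with three CDR segments. A minimal sketch follows, assuming a hypothetical `graft_variable_domain` helper; the segment strings are obvious placeholders, not real antibody sequences, and CDRs are written in lower case only to make them visible in the output.

```python
from typing import List, Sequence


def graft_variable_domain(frs: Sequence[str], cdrs: Sequence[str]) -> str:
    """Interleave segments as FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4."""
    if len(frs) != 4 or len(cdrs) != 3:
        raise ValueError("expected 4 FR segments and 3 CDR segments")
    parts: List[str] = []
    for fr, cdr in zip(frs, cdrs):  # pairs FR1..FR3 with CDR1..CDR3
        parts.extend([fr, cdr])
    parts.append(frs[3])  # FR4 closes the domain
    return "".join(parts)


# Placeholder segments (NOT real sequences): FRs stand in for a first
# antibody, CDRs for a second, different antibody.
FIRST_ANTIBODY_FRS = ["FRONE", "FRTWO", "FRTHREE", "FRFOUR"]
SECOND_ANTIBODY_CDRS = ["cdrone", "cdrtwo", "cdrthree"]

CHIMERIC_DOMAIN = graft_variable_domain(FIRST_ANTIBODY_FRS, SECOND_ANTIBODY_CDRS)
```

Passing FRs and CDRs taken from the same antibody to the same function corresponds to the alternative case in which a single preexisting antibody is copied in its entirety.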

Practice of the invention enables the design and biosynthesis of various reagents, all of which are
characterized by a region having affinity for a preselected antigenic determinant. The binding site and other
regions of the biosynthetic protein are designed with the particular planned utility of the protein in mind.
Thus, if the reagent is designed for intravascular use in mammals, the FR regions may comprise amino
acids similar or identical to at least a portion of the framework region amino acids of antibodies native to
that mammalian species. On the other hand, the amino acids comprising the CDRs may be analogous to a
portion of the amino acids from the hypervariable region (and certain flanking amino acids) of an antibody
having a known affinity and specificity, e.g., a murine or rat monoclonal antibody.

Other sections of native immunoglobulin protein structure, e.g., CH and CL, need not be present and
normally are intentionally omitted from the biosynthetic proteins. However, the proteins of the invention 
normally comprise additional polypeptide or protein regions defining a bioactive region, e.g., a toxin or 
enzyme, or a site onto which a toxin or a remotely detectable substance can be attached. 

The invention thus can provide intact biosynthetic antibody binding sites analogous to Vh-Vl dimers,
either non-covalently associated, disulfide bonded, or preferably linked by a polypeptide sequence to form
a composite Vh-Vl or Vl-Vh polypeptide which may be essentially free of antibody constant region. The
invention also provides proteins analogous to an independent Vh or Vl domain, or dimers thereof. Any of
these proteins may be provided in a form linked to, for example, amino acids analogous or homologous to a
bioactive molecule such as a hormone or toxin.

Connecting the independently functional regions of the protein is a spacer comprising a short amino
acid sequence whose function is to separate the functional regions so that they can independently assume
their active tertiary conformation. The spacer can consist of an amino acid sequence present on the end of
a functional protein which sequence is not itself required for its function, and/or specific sequences
engineered into the protein at the DNA level.

The spacer generally may comprise between 5 and 25 residues. Its optimal length may be determined
using constructs of different spacer lengths varying, for example, by units of 5 amino acids. The specific
amino acids in the spacer can vary. Cysteines should be avoided. Hydrophilic amino acids are preferred.
The spacer sequence may mimic the sequence of a hinge region of an immunoglobulin. It may also be
designed to assume a structure, such as a helical structure. Proteolytic cleavage sites may be designed
into the spacer separating the variable region-like sequences from other pendant sequences so as to
facilitate cleavage of intact BABS, free of other protein, or so as to release the bioactive protein in vivo.
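
The length scan described above (5 to 25 residues, varied in units of 5) can be enumerated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the glycine/serine fill pattern "GS" is an illustrative choice of cysteine-free residues, since the text does not prescribe a spacer composition.

```python
from typing import List


def candidate_spacer_lengths(shortest: int = 5, longest: int = 25, step: int = 5) -> List[int]:
    """Spacer lengths to try, inclusive of both ends of the range."""
    return list(range(shortest, longest + 1, step))


def make_spacer(length: int, unit: str = "GS") -> str:
    """Fill a spacer of the requested length by cycling a residue pattern."""
    if length < 1 or not unit:
        raise ValueError("length must be positive and unit non-empty")
    if "C" in unit.upper():
        raise ValueError("cysteines should be avoided in spacers")
    repeats = -(-length // len(unit))  # ceiling division
    return (unit * repeats)[:length]


# One candidate spacer per length in the scan; the "GS" pattern is an assumption.
SPACER_PANEL = {n: make_spacer(n) for n in candidate_spacer_lengths()}
```

Each entry in `SPACER_PANEL` stands for one construct in the comparison; which length is optimal would be decided experimentally, not by this enumeration.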

Figures 2A-2E illustrate five examples of protein structures embodying the invention that can be
produced by following the teaching disclosed herein. All are characterized by a biosynthetic polypeptide
defining a binding site 3, comprising amino acid sequences comprising CDRs and FRs, often derived from
different immunoglobulins, or sequences homologous to a portion of CDRs and FRs from different
immunoglobulins. Figure 2A depicts a single chain construct comprising a polypeptide domain 10 having an
amino acid sequence analogous to the variable region of an immunoglobulin heavy chain, bound through its
carboxyl end to a polypeptide linker 12, which in turn is bound to a polypeptide domain 14 having an amino
acid sequence analogous to the variable region of an immunoglobulin light chain. Of course, the light and
heavy chain domains may be in reverse order. Alternatively, the binding site may comprise two
substantially homologous amino acid sequences which are both analogous to the variable region of an
immunoglobulin heavy or light chain.
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The linker 12 should be long enough (e.g.. about 15 amino acids or about 40 A to permit the chains 10 
and 14 to assume th ir proper conformation. The linker 12 may compris an amino acid sequenc 
homologous to a s quence identified as "self by the species into which it will be introduced, if drug us is 
intended. For example, th link r may comprise an amino acid sequence patt rn d aft r a hing region of 
5 an immunoglobulin. The linker preferably comprises hydrophilic amino acid sequences. It may also 
comprise a bioactive polypeptide such as a cell toxin which is to be targeted by the binding site, or a 
segment easily labelled by a radioactive reagent which is to be delivered, e.g.. to th site of a tumor 
comprising an epitope recognized by the binding site. The linker may also include one or two built-in 
• ' * cleavage 'site^. i.e.. an amino afcid or ammo acid- sequence susceptible to* attack by ijsite' specific cleavage 

i(r agent as described below. This strategy permits the V„ and V|.-like domains to be separated after 
expression, or the linker to be excised after folding while retaining the binding site structure in non-covalent 
association. The amino acids of the linker preferably are selected from among those having relatively small, 
unreactive side chains. Alanine, serine, and glycine are preferred. 

Generally, the design of the linker involves considerations similar to the design of the spacer, excepting 
that binding properties of the linked domains are seriously degraded if the linker sequence is shorter than 
about 20 Å in length, i.e., comprises fewer than about 10 residues. Linkers longer than the approximate 40 Å 
distance between the N-terminal of a native variable region and the C-terminal of its sister chain may be 
used, but also potentially can diminish the BABS binding properties. Linkers comprising between 12 and 18 
residues are preferred. The preferred length in specific constructs may be determined by varying linker 
length first by units of 5 residues, and second by units of 1-4 residues after determining the best multiple of 
the pentameric starting units. 
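
The two-stage length search described above may be sketched in Python as follows. This is an illustrative sketch only: the function names, the default 5 to 25 residue range (borrowed from the spacer guidance), and the Gly/Ser repeat used to build a placeholder sequence are assumptions made for the example and are not specified in the text, which states only that alanine, serine, and glycine are preferred.

```python
def coarse_lengths(min_len=5, max_len=25, step=5):
    """Candidate linker lengths in multiples of `step` residues (first pass)."""
    first = -(-min_len // step) * step  # round min_len up to a multiple of step
    return list(range(first, max_len + 1, step))


def fine_lengths(best, max_offset=4):
    """Lengths 1..max_offset residues either side of the best coarse length (second pass)."""
    return [best + d for d in range(-max_offset, max_offset + 1)
            if d != 0 and best + d > 0]


def make_linker(n, unit="GGGGS"):
    """Build a placeholder n-residue linker by repeating `unit`.

    The repeat unit is a hypothetical choice for illustration; it is
    cysteine-free and uses small side chains, but is not taken from the text.
    """
    return (unit * (n // len(unit) + 1))[:n]


if __name__ == "__main__":
    print(coarse_lengths())   # first-pass constructs to build and test
    print(fine_lengths(15))   # second pass, assuming 15 residues tested best
    print(make_linker(15))
```

In practice each candidate length corresponds to a separate construct that must be expressed and assayed for binding; the code only enumerates which constructs to make.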

Additional proteins or polypeptides may be attached to either or both the amino or carboxyl termini of 
the binding site to produce multifunctional proteins of the type illustrated in Figures 2B-2E. As an example, 
in Figure 2B, a helically coiled polypeptide structure 16 comprises a protein A fragment (FB) linked to the 
amino terminal end of a Vh-like domain 10 via a spacer 18. Figure 2C illustrates a bifunctional protein 
having an effector polypeptide 20 linked via spacer 22 to the carboxyl terminus of polypeptide 14 of binding 
protein segment 2. This effector polypeptide 20 may consist of, for example, a toxin, therapeutic drug, 
binding protein, enzyme or enzyme fragment, site of attachment for an imaging agent (e.g., to chelate a 
radioactive ion such as indium), or site of selective attachment to an immobilization matrix so that the BABS 
can be used in affinity chromatography or solid phase binding assay. This effector alternatively may be 
linked to the amino terminus of polypeptide 10, although trailers are preferred. Figure 2D depicts a 
trifunctional protein comprising a linked pair of BABS 2 having another distinct protein domain 20 attached 
to the N-terminus of the first binding protein segment. Use of multiple BABS in a single protein enables 
production of constructs having very high selective affinity for multiepitopic sites such as cell surface 
proteins. 

The independently functional domains are attached by a spacer 18 (Figs. 2B and 2D) covalently linking 
the C-terminus of the protein 16 or 20 to the N-terminus of the first domain 10 of the binding protein 
segment 2, or by a spacer 22 linking the C-terminus of the second binding domain 14 to the N-terminus of 
another protein (Figs. 2C and 2D). The spacer may be an amino acid sequence analogous to linker 
sequence 12, or it may take other forms. As noted above, the spacer's primary function is to separate the 
active protein regions to promote their independent bioactivity and permit each region to assume its 
bioactive conformation independent of interference from its neighboring structure. 

Figure 2E depicts another type of reagent, comprising a BABS having only one set of three CDRs, e.g., 
analogous to a heavy chain variable region, which retains a measure of affinity for the antigen. Attached to 
the carboxyl end of the polypeptide 10 or 14 comprising the FR and CDR sequences constituting the 
binding site 3 through spacer 22 is effector polypeptide 20 as described above. 

As is evidenced from the foregoing, the invention provides a large family of reagents comprising 
proteins, at least a portion of which defines a binding site patterned after the variable region of an 
immunoglobulin. It will be apparent that the nature of any protein fragments linked to the BABS, and used 
for reagents embodying the invention, is essentially unlimited, the essence of the invention being the 
provision, either alone or linked to other proteins, of binding sites having specificities to any antigen desired. 

The clinical administration of multifunctional proteins comprising a BABS, or a BABS alone, affords a 
number of advantages over the use of intact natural or chimeric antibody molecules, fragments thereof, and 
conjugates comprising such antibodies linked chemically to a second bioactive moiety. The multifunctional 
proteins described herein offer fewer cleavage sites to circulating proteolytic enzymes, their functional 
domains are connected by peptide bonds to polypeptide linker or spacer sequences, and thus the proteins 
have improved stability. Because of their smaller size and efficient design, the multifunctional proteins 
described herein reach their target tissue more rapidly, and are cleared more quickly from the body. They 
also have reduced immunogenicity. In addition, their design facilitates coupling to other moieties in drug 
targeting and imaging applications. Such coupling may be conducted chemically after expression of the 
BABS to a site of attachment for the coupling product engineered into the protein at the DNA level. Active 
effector proteins having toxic, enzymatic, binding, modulating, cell differentiating, hormonal, or other 
bioactivity are expressed from a single DNA as a leader and/or trailer sequence, peptide bonded to the 
BABS. 

Design and Manufacture 

The proteins of the invention are designed at the DNA level. The chimeric or synthetic DNAs are then 
expressed in a suitable host system, and the expressed proteins are collected and renatured if necessary. A 
preferred general structure of the DNA encoding the proteins is set forth in Figure 8. As illustrated, it 
encodes an optimal leader sequence used to promote expression in procaryotes having a built-in cleavage 
site recognizable by a site specific cleavage agent, for example, an endopeptidase, used to remove the 
leader after expression. This is followed by DNA encoding a Vh-like domain, comprising CDRs and FRs, a 
linker, a Vl-like domain, again comprising CDRs and FRs, a spacer, and an effector protein. After 
expression, folding, and cleavage of the leader, a bifunctional protein is produced having a binding region 
whose specificity is determined by the CDRs, and a peptide-linked independently functional effector region. 
The ability to design the BABS of the invention depends on the ability to determine the sequence of the 
amino acids in the variable region of monoclonal antibodies of interest, or the DNA encoding them. 
Hybridoma technology enables production of cell lines secreting antibody to essentially any desired 
substance that produces an immune response. RNA encoding the light and heavy chains of the 
immunoglobulin can then be obtained from the cytoplasm of the hybridoma. The 5' end portion of the mRNA 
can be used to prepare cDNA for subsequent sequencing, or the amino acid sequence of the hypervariable 
and flanking framework regions can be determined by amino acid sequencing of the V region fragments of 
the H and L chains. Such sequence analysis is now conducted routinely. This knowledge, coupled with 
observations and deductions of the generalized structure of immunoglobulin Fvs, permits one to design 
synthetic genes encoding FR and CDR sequences which likely will bind the antigen. These synthetic genes 
are then prepared using known techniques, or using the technique disclosed below, inserted into a suitable 
host, and expressed, and the expressed protein is purified. Depending on the host cell, renaturation 
techniques may be required to attain proper conformation. The various proteins are then tested for binding 
ability, and one having appropriate affinity is selected for incorporation into a reagent of the type described 
above. If necessary, point substitutions seeking to optimize binding may be made in the DNA using 
conventional cassette mutagenesis or other protein engineering methodology such as is disclosed below. 

Preparation of the proteins of the invention also is dependent on knowledge of the amino acid sequence 
(or corresponding DNA or RNA sequence) of bioactive proteins such as enzymes, toxins, growth factors, 
cell differentiation factors, receptors, anti-metabolites, hormones or various cytokines or lymphokines. Such 
sequences are reported in the literature and available through computerized data banks. 

The DNA sequences of the binding site and the second protein domain are fused using conventional 
techniques, or assembled from synthesized oligonucleotides, and then expressed using equally 
conventional techniques. 

The processes for manipulating, amplifying, and recombining DNA which encode amino acid sequences 
of interest are generally well known in the art and, therefore, are not described in detail herein. Methods of 
identifying and isolating genes encoding antibodies of interest are well understood, and described in the 
patent and other literature. In general, the methods involve selecting genetic material coding for amino acids 
which define the proteins of interest, including the CDRs and FRs of interest, according to the genetic code. 

Accordingly, the construction of DNAs encoding proteins as disclosed herein can be done using known 
techniques involving the use of various restriction enzymes which make sequence specific cuts in DNA to 
produce blunt ends or cohesive ends, DNA ligases, techniques enabling enzymatic addition of sticky ends 
to blunt-ended DNA, construction of synthetic DNAs by assembly of short or medium length 
oligonucleotides, cDNA synthesis techniques, and synthetic probes for isolating immunoglobulin or other 
bioactive protein genes. Various promoter sequences and other regulatory DNA sequences used in 
achieving expression, and various types of host cells are also known and available. Conventional 
transfection techniques, and equally conventional techniques for cloning and subcloning DNA are useful in the 
practice of this invention and known to those skilled in the art. Various types of vectors may be used such 
as plasmids and viruses including animal viruses and bacteriophages. The vectors may exploit various 
marker genes which impart to a successfully transfected cell a detectable phenotypic property that can be 
used to identify which of a family of clones has successfully incorporated the recombinant DNA of the 
vector. 

One method for obtaining DNA encoding the proteins disclosed herein is by assembly of synthetic 
oligonucleotides produced in a conventional, automated, polynucleotide synthesizer followed by ligation with 
appropriate ligases. For example, overlapping, complementary DNA fragments comprising 15 bases may be 
synthesized semi-manually using phosphoramidite chemistry, with end segments left unphosphorylated to 
prevent polymerization during ligation. One end of the synthetic DNA is left with a "sticky end" 
corresponding to the site of action of a particular restriction endonuclease, and the other end is left with an end 
corresponding to the site of action of another restriction endonuclease. Alternatively, this approach can be 
fully automated. The DNA encoding the protein may be created by synthesizing longer single strand 
fragments (e.g., 50-100 nucleotides long) in, for example, a Biosearch oligonucleotide synthesizer, and then 
ligating the fragments. 

A method of producing the BABS of the invention is to produce a synthetic DNA encoding a 
polypeptide comprising, e.g., human FRs, and intervening "dummy" CDRs, or amino acids having no 
function except to define suitably situated unique restriction sites. This synthetic DNA is then altered by 
DNA replacement, in which restriction and ligation is employed to insert synthetic oligonucleotides encoding 
CDRs defining a desired binding specificity in the proper location between the FRs. This approach 
facilitates empirical refinement of the binding properties of the BABS. 

This technique is dependent upon the ability to cleave a DNA corresponding in structure to a variable 
domain gene at specific sites flanking nucleotide sequences encoding CDRs. These restriction sites in 
some cases may be found in the native gene. Alternatively, non-native restriction sites may be engineered 
into the nucleotide sequence resulting in a synthetic gene with a different sequence of nucleotides than the 
native gene, but encoding the same variable region amino acids because of the degeneracy of the genetic 
code. The fragments resulting from endonuclease digestion, and comprising FR-encoding sequences, are 
then ligated to non-native CDR-encoding sequences to produce a synthetic variable domain gene with 
altered antigen binding specificity. Additional nucleotide sequences encoding, for example, constant region 
amino acids or a bioactive molecule may then be linked to the gene sequences to produce a bifunctional 
protein. 
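
The replacement of a "dummy" CDR between two unique flanking restriction sites may be modelled, in a much simplified way, as a string operation. The Python sketch below is illustrative only. It assumes the recognition sequences of KpnI (GGTACC) and XbaI (TCTAGA), treats both sites as staying intact after ligation, and ignores the actual cut positions and overhangs of the enzymes. The toy gene, the insert, and the function name are hypothetical.

```python
KPNI = "GGTACC"  # KpnI recognition sequence
XBAI = "TCTAGA"  # XbaI recognition sequence


def replace_between_sites(gene, site_5, site_3, insert):
    """Swap the DNA lying between two unique restriction sites for `insert`.

    Simplified model: both recognition sites are kept, and everything
    strictly between them is replaced. Raises if either site is not unique,
    since a non-unique site would cut the gene elsewhere as well.
    """
    for site in (site_5, site_3):
        if gene.count(site) != 1:
            raise ValueError(f"site {site} must occur exactly once in the gene")
    start = gene.index(site_5) + len(site_5)
    end = gene.index(site_3)
    if end < start:
        raise ValueError("5' site must lie upstream of 3' site")
    return gene[:start] + insert + gene[end:]


if __name__ == "__main__":
    # Toy framework gene: FR, site, dummy CDR, site, FR (all hypothetical).
    dummy_gene = "ATGGCT" + KPNI + "AAAAAAAAA" + XBAI + "GCTTAA"
    new_cdr = "CGTTATACT"  # hypothetical CDR-encoding oligonucleotide
    print(replace_between_sites(dummy_gene, KPNI, XBAI, new_cdr))
```

The uniqueness check mirrors the requirement stated above that the flanking sites be unique within the gene, so that digestion excises only the intended segment.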

The expression of these synthetic DNAs can be achieved in both prokaryotic and eucaryotic systems 
via transfection with an appropriate vector. In E. coli and other microbial hosts, the synthetic genes can be 
expressed as a fusion protein which is subsequently cleaved. Expression in eucaryotes can be accomplished 
by the transfection of DNA sequences encoding CDR and FR region amino acids and the amino acids 
defining a second function into a myeloma or other type of cell line. By this strategy intact hybrid antibody 
molecules having hybrid Fv regions and various bioactive proteins including a biosynthetic binding site may 
be produced. For fusion protein expressed in bacteria, subsequent proteolytic cleavage of the isolated 
fusions can be performed to yield free BABS, which can be renatured to obtain an intact biosynthetic, 
hybrid antibody binding site. 

Heretofore, it has not been possible to cleave the heavy and light chain region to separate the variable 
and constant regions of an immunoglobulin so as to produce intact Fv, except in specific cases not of 
commercial utility. However, one method of producing BABS in accordance with this invention is to 
redesign DNAs encoding the heavy and light chains of an immunoglobulin, optionally altering its specificity 
or humanizing its FRs, and incorporating a cleavage site and "hinge region" between the variable and 
constant regions of both the heavy and light chains. Such chimeric antibodies can be produced in 
transfectomas or the like and subsequently cleaved using a preselected endopeptidase. 

The hinge region is a sequence of amino acids which serves to promote efficient cleavage by a 
preselected cleavage agent at a preselected, built-in cleavage site. It is designed to promote cleavage 
preferentially at the cleavage site when the polypeptide is treated with the cleavage agent in an appropriate 
environment. 

The hinge region can take many different forms. Its design involves selection of amino acid residues 
(and a DNA fragment encoding them) which impart to the region of the fused protein about the cleavage 
site an appropriate polarity, charge distribution, and stereochemistry which, in the aqueous environment 
where the cleavage takes place, efficiently exposes the cleavage site to the cleavage agent in preference to 
other potential cleavage sites that may be present in the polypeptide, and/or to improve the kinetics of the 
cleavage reaction. In specific cases, the amino acids of the hinge are selected and assembled in sequence 
based on their known properties, and then the fused polypeptide sequence is expressed, tested, and 
altered for refinement. 

The hinge region is free of cysteine. This enables the cleavage reaction to be conducted under 
conditions in which the protein assumes its tertiary conformation, and may be held in this conformation by 
intramolecular disulfide bonds. It has been discovered that in these conditions access of the protease to 
potential cleavage sites which may be present within the target protein is hindered. The hinge region may 
comprise an amino acid sequence which includes one or more proline residues. This allows formation of a 
substantially unfolded molecular segment. Aspartic acid, glutamic acid, arginine, lysine, serine, and 
threonine residues maximize ionic interactions and may be present in amounts and/or in sequence which 
renders the moiety comprising the hinge water soluble. 

The cleavage site preferably is immediately adjacent the Fv polypeptide chains and comprises one 
amino acid or a sequence of amino acids exclusive of any sequence found in the amino acid structure of 
the chains in the Fv. The cleavage site preferably is designed for unique or preferential cleavage by a 
specific selected agent. Endopeptidases are preferred, although non-enzymatic (chemical) cleavage agents 
may be used. Many useful cleavage agents, for instance, cyanogen bromide, dilute acid, trypsin, 
Staphylococcus aureus V-8 protease, post proline cleaving enzyme, blood coagulation Factor Xa, enterokinase, 
and renin, recognize and preferentially or exclusively cleave particular cleavage sites. One currently 
preferred cleavage agent is V-8 protease. The currently preferred cleavage site is a Glu residue. Other 
useful enzymes recognize multiple residues as a cleavage site, e.g., factor Xa (Ile-Glu-Gly-Arg) or 
enterokinase (Asp-Asp-Asp-Asp-Lys). The principles of this selective cleavage approach may also be used 
in the design of the linker and spacer sequences of the multifunctional constructs of the invention where an 
excisable linker or selectively cleavable linker or spacer is desired. 
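
By way of illustration, the requirement that a cleavage site be exclusive of any sequence found in the Fv chains can be checked by scanning a candidate amino acid sequence for the recognition sequences named above. The Python sketch below is a minimal example under stated assumptions. It uses single letter codes (E for Glu, IEGR for the factor Xa site, DDDDK for the enterokinase site) and assumes that each agent cuts on the carboxyl side of its recognition sequence. It ignores secondary specificities and the effects of folding, and the test sequence and function names are hypothetical.

```python
import re

# Recognition sequences in single letter code; cutting is assumed to occur
# immediately after the last residue of each match.
CLEAVAGE_PATTERNS = {
    "V-8 protease": "E",
    "factor Xa": "IEGR",
    "enterokinase": "DDDDK",
}


def cleavage_positions(seq, agent):
    """Number of residues preceding each predicted cut (1-based index of the
    last residue of each recognition site)."""
    return [m.end() for m in re.finditer(CLEAVAGE_PATTERNS[agent], seq)]


def fragments(seq, agent):
    """Peptide fragments expected after complete digestion with `agent`."""
    bounds = [0] + cleavage_positions(seq, agent) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if a < b]


def has_unique_site(seq, agent):
    """True when the agent would cut the sequence at exactly one place."""
    return len(cleavage_positions(seq, agent)) == 1


if __name__ == "__main__":
    fusion = "MKAIEGRQVQLEK"  # hypothetical leader + site + start of a domain
    for agent in CLEAVAGE_PATTERNS:
        print(agent, cleavage_positions(fusion, agent), fragments(fusion, agent))
```

A check of this kind shows why a multi-residue site can be preferable in a given construct: a single Glu may recur within the protein of interest, whereas a four or five residue site is less likely to.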

Design of Synthetic Vh and Vl Mimics 

FRs from the heavy and light chain murine anti-digoxin monoclonal 26-10 (Figures 4A and 4B) were 
encoded on the same DNAs with CDRs from the murine anti-lysozyme monoclonal glp-4 heavy chain 
(Figure 3 sequence 1) and light chain to produce Vh (Figure 4C) and Vl (Figure 4D) regions together 
defining a biosynthetic antibody binding site which is specific for lysozyme. Murine CDRs from both the 
heavy and light chains of monoclonal glp-4 were encoded on the same DNAs with FRs from the heavy and 
light chains of human myeloma antibody NEWM (Figures 4E and 4F). The resulting interspecies chimeric 
antibody binding domain has reduced immunogenicity in humans because of its human FRs, and specificity 
for lysozyme because of its murine CDRs. 

A synthetic DNA was designed to facilitate CDR insertions into a human heavy chain FR and to 
facilitate empirical refinement of the resulting chimeric amino acid sequence. This DNA is depicted in 
Figure 5. 

A synthetic, bifunctional FB-binding site protein was also designed at the DNA level, expressed, 
purified, renatured, and shown to bind specifically with a preselected antigen (digoxin) and Fc. The detailed 
primary structure of this construct is shown in Figure 6; its tertiary structure is illustrated schematically in 
Figure 2B. 

Details of these and other experiments, and additional design principles on which the invention is 
based, are set forth below. 

GENE DESIGN AND EXPRESSION 

Given known variable region DNA sequences, synthetic Vl and Vh genes may be designed which 
encode native or near native FR and CDR amino acid sequences from an antibody molecule, each 
separated by unique restriction sites located as close to FR-CDR and CDR-FR borders as possible. 
Alternatively, genes may be designed which encode native FR sequences which are similar or identical to 
the FRs of an antibody molecule from a selected species, each separated by "dummy" CDR sequences 
containing strategically located restriction sites. These DNAs serve as starting materials for producing 
BABS, as the native or "dummy" CDR sequences may be excised and replaced with sequences encoding 
the CDR amino acids defining a selected binding site. Alternatively, one may design and directly synthesize 
native or near-native FR sequences from a first antibody molecule, and CDR sequences from a second 
antibody molecule. Any one of the Vh and Vl sequences described above may be linked together directly, 
via an amino acid chain or linker connecting the C-terminus of one chain with the N-terminus of the other. 

These genes, once synthesized, may be cloned with or without additional DNA sequences coding for, 
e.g., an antibody constant region, enzyme, or toxin, or a leader peptide which facilitates secretion or 
intracellular stability of a fusion polypeptide. The genes then can be expressed directly in an appropriate 
host cell, or can be further engineered before expression by the exchange of FR, CDR, or "dummy" CDR 
sequences with new sequences. This manipulation is facilitated by the presence of the restriction sites 
which have been engineered into the gene at the FR-CDR and CDR-FR borders. 



EP 0 318 554 B1 



Figure 3 illustrates the general approach to designing a chimeric Vh; further details of exemplary 
designs at the DNA level are shown in Figures 4A-4F. Figure 3, lines 1 and 2, show the amino acid 
sequences of the heavy chain variable region of the murine monoclonals glp-4 (anti-lysozyme) and 26-10 
(anti-digoxin), including the four FR and three CDR sequences of each. Line 3 shows the sequence of a 
chimeric Vh which comprises 26-10 FRs and glp-4 CDRs. As illustrated, the hybrid protein of line 3 is 
identical to the native protein of line 2, except that 1) the sequence TFTNYYIHWLK has replaced the 
sequence IFTOFYMNWVR, 2) EWIGWIYPGNGNTKYNENFKG has replaced DYIGYISPYSGVTGYNQKFKG, 
3) RYTHYYF has replaced GSSGNKWAM, and 4) A has replaced V as the sixth amino acid beyond CDR-2. 
These changes have the effect of changing the specificity of the 26-10 Vh to mimic the specificity of glp-4. 
The Ala to Val single amino acid replacement within the relatively conserved framework region of 26-10 is 
an example of the replacement of an amino acid outside the hypervariable region made for the purpose of 
altering specificity by CDR replacement. Beneath sequence 3 of Figure 3, the restriction sites in the DNA 
encoding the chimeric Vh (see Figures 4A-4F) are shown which are disposed about the CDR-FR borders. 
Lines 4 and 5 of Figure 3 represent another construct. Line 4 is the full length Vh of the human antibody 
NEWM. That human antibody may be made specific for lysozyme by CDR replacement as shown in line 5. 
Thus, for example, the segment TFTNYYIHWLK from glp-4 replaces TFSNDYYTWVR of NEWM, and its 
other CDRs are replaced as shown. This results in a Vh comprising a human framework with murine 
sequences determining specificity. 
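
At the amino acid level, the segment replacement just described amounts to substituting one stretch of residues for another while leaving the framework untouched. The Python sketch below illustrates this with the one NEWM segment pair quoted above. The flanking residues ("AAAAA" and "GGGGG") are placeholders, not the actual NEWM framework, and the function name is hypothetical.

```python
def graft_segments(framework_seq, replacements):
    """Replace each (old, new) segment in turn, leaving all other residues as is.

    Each old segment must occur exactly once, so that the substitution is
    unambiguous.
    """
    result = framework_seq
    for old, new in replacements:
        if result.count(old) != 1:
            raise ValueError(f"segment {old} must occur exactly once")
        result = result.replace(old, new)
    return result


if __name__ == "__main__":
    # Placeholder flanks stand in for the human framework residues.
    toy_newm = "AAAAA" + "TFSNDYYTWVR" + "GGGGG"
    grafted = graft_segments(toy_newm, [("TFSNDYYTWVR", "TFTNYYIHWLK")])
    print(grafted)
```

In this pair the two segments happen to have equal length, so the framework positions are undisturbed; segments of unequal length would shift the numbering of downstream residues.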

By sequencing any antibody, or obtaining the sequence from the literature, in view of this disclosure 
one skilled in the art can produce a BABS of any desired specificity comprising any desired framework 
region. Diagrams such as Figure 3 comparing the amino acid sequence are valuable in suggesting which 
particular amino acids should be replaced to determine the desired complementarity. Expressed sequences 
may be tested for binding and refined by exchanging selected amino acids in relatively conserved regions, 
based on observation of trends in amino acid sequence data and/or computer modeling techniques. 

Significant flexibility in Vh and Vl design is possible because the amino acid sequences are determined 
at the DNA level, and the manipulation of DNA can be accomplished easily. 

For example, the DNA sequence for murine Vh and Vl 26-10 containing specific restriction sites flanking 
each of the three CDRs was designed with the aid of a commercially available computer program which 
performs combined reverse translation and restriction site searches ("RV.exe" by Compugene, Inc.). The 
known amino acid sequences for Vh and Vl 26-10 polypeptides were entered, and all potential DNA 
sequences which encode those peptides and all potential restriction sites were analyzed by the program. 
The program can, in addition, select DNA sequences encoding the peptide using only codons preferred by 
E. coli if this bacterium is to be the host expression organism of choice. Figures 4A and 4B show an example of 
program output. The nucleic acid sequences of the synthetic gene and the corresponding amino acids are 
shown. Sites of restriction endonuclease cleavage are also indicated. The CDRs of these synthetic genes 
are underlined. 
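
The combined reverse translation and restriction site search can be illustrated for very short peptides with the Python sketch below. It is not a reconstruction of the commercial program named above. It simply enumerates every DNA sequence encoding a peptide under the standard genetic code and keeps those that contain a given recognition sequence. Codon preference for E. coli is not modelled, and the function names are hypothetical. The number of encodings grows exponentially with peptide length, so the approach is practical only for a few residues at a time.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Reverse map: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def encodings(peptide):
    """Yield every DNA sequence that encodes `peptide` (single letter code)."""
    for codons in product(*(CODONS_FOR[aa] for aa in peptide)):
        yield "".join(codons)


def encodings_with_site(peptide, site):
    """DNA encodings of `peptide` that contain the recognition sequence `site`."""
    return [dna for dna in encodings(peptide) if site in dna]


if __name__ == "__main__":
    # Gly-Thr can be encoded so as to create a KpnI site (GGTACC), and
    # Ser-Arg so as to create an XbaI site (TCTAGA), without changing the peptide.
    print(encodings_with_site("GT", "GGTACC"))
    print(encodings_with_site("SR", "TCTAGA"))
```

The two examples show how the degeneracy of the genetic code allows a restriction site to be introduced at a chosen position while the encoded amino acids stay the same.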

The DNA sequences for the synthetic 26-10 Vh and Vl are designed so that one or both of the 
restriction sites flanking each of the three CDRs are unique. A six base site (such as that recognized by 
Bsm I or BspM I) is preferred, but where six base sites are not possible, four or five base sites are used. 
These sites, if not already unique, are rendered unique within the gene by eliminating other occurrences 
within the gene without altering necessary amino acid sequences. Preferred cleavage sites are those that, 
once cleaved, yield fragments with sticky ends just outside of the boundary of the CDR within the 
framework. However, such ideal sites are only occasionally possible because the FR-CDR boundary is not 
an absolute one, and because the amino acid sequence of the FR may not permit a restriction site. In these 
cases, flanking sites in the FR which are more distant from the predicted boundary are selected. 

Figure 5 discloses the nucleotide and corresponding amino acid sequence (shown in standard single 
letter code) of a synthetic DNA comprising a master framework gene having the generic structure: 

R1-FR1-X1-FR2-X2-FR3-X3-FR4-R2 

where R1 and R2 are restricted ends which are to be ligated into a vector, and X1, X2, and X3 are DNA 
sequences whose function is to provide convenient restriction sites for CDR insertion. This particular DNA 
has murine FR sequences and unique, 6-base restriction sites adjacent the FR borders so that nucleotide 
sequences encoding CDRs from a desired monoclonal can be inserted easily. Restriction endonuclease 
digestion sites are indicated with their abbreviations; enzymes of choice for CDR replacement are 
underscored. Digestion of the gene with the following restriction endonucleases results in 3' and 5' ends 
which can easily be matched up with and ligated to native or synthetic CDRs of desired specificity: KpnI 
and BstXI are used for ligation of CDR1; XbaI and DraI for CDR2; and BssHII and ClaI for CDR3. 

OLIGONUCLEOTIDE SYNTHESIS 

The synthetic genes and DNA fragments designed as described above preferably are produced by 
assembly of chemically synthesized oligonucleotides. 15-100mer oligonucleotides may be synthesized on a 
Biosearch DNA Model 8600 Synthesizer, and purified by polyacrylamide gel electrophoresis (PAGE) in Tris- 
Borate-EDTA buffer (TBE). The DNA is then electroeluted from the gel. Overlapping oligomers may be 
phosphorylated by T4 polynucleotide kinase and ligated into larger blocks which may also be purified by 
PAGE. 



CLONING OF SYNTHETIC OLIGONUCLEOTIDES 

The blocks or the pairs of longer oligonucleotides may be cloned into E. coli using a suitable, e.g., pUC, 
cloning vector. Initially, this vector may be altered by single strand mutagenesis to eliminate residual six 
base altered sites. For example, Vh may be synthesized and cloned into pUC as five primary blocks 
spanning the following restriction sites: 1. EcoRI to first NarI site; 2. first NarI to XbaI; 3. XbaI to SalI; 4. SalI 
to NcoI; 5. NcoI to BamHI. These cloned fragments may then be isolated and assembled in several three- 
fragment ligations and cloning steps into the pUC8 plasmid. Desired ligations selected by PAGE are then 
transformed into, for example, E. coli strain JM83, and plated onto LB Ampicillin + Xgal plates according to 
standard procedures. The gene sequence may be confirmed by supercoil sequencing after cloning, or after 
subcloning into M13 via the dideoxy method of Sanger. 

PRINCIPLE OF CDR EXCHANGE 

Three CDRs (or alternatively, four FRs) can be replaced per Vh or Vl. In simple cases, this can be 
accomplished by cutting the shuttle pUC plasmid containing the respective genes at the two unique 
restriction sites flanking each CDR or FR, removing the excised sequence, and ligating the vector with a 
native nucleic acid sequence or a synthetic oligonucleotide encoding the desired CDR or FR. This three 
part procedure would have to be repeated three times for total CDR replacement and four times for total FR 
replacement. Alternatively, a synthetic nucleotide encoding two consecutive CDRs separated by the 
appropriate FR can be ligated to a pUC or other plasmid containing a gene whose corresponding CDRs and 
FR have been cleaved out. This procedure reduces the number of steps required to perform CDR and/or 
FR exchange. 

EXPRESSION OF PROTEINS 

The engineered genes can be expressed in appropriate prokaryotic hosts such as various strains of E. 
coli, and in eucaryotic hosts such as Chinese hamster ovary cells, murine myeloma, and human 
myeloma/transfectoma cells. 

For example, if the gene is to be expressed in E. coli, it may first be cloned into an expression vector. 
This is accomplished by positioning the engineered gene downstream from a promoter sequence such as 
trp or tac, and a gene coding for a leader peptide. The resulting expressed fusion protein accumulates in 
refractile bodies in the cytoplasm of the cells, and may be harvested after disruption of the cells by French 
press or sonication. The refractile bodies are solubilized, and the expressed proteins refolded and cleaved 
by the methods already established for many other recombinant proteins. 

If the engineered gene is to be expressed in myeloma cells, the conventional expression system for 
immunoglobulins, it is first inserted into an expression vector containing, for example, the Ig promoter, a 
secretion signal, immunoglobulin enhancers, and various introns. This plasmid may also contain sequences 
encoding all or part of a constant region, enabling an entire part of a heavy or light chain to be expressed. 
The gene is transfected into myeloma cells via established electroporation or protoplast fusion methods. 
Cells so transfected can express Vl or Vh fragments, Vl2 or Vh2 homodimers, Vl-Vh heterodimers, Vh-Vl or 
Vl-Vh single chain polypeptides, complete heavy or light immunoglobulin chains, or portions thereof, each 
of which may be attached in the various ways discussed above to a protein region having another function 
(e.g., cytotoxicity). 

Vectors containing a heavy chain V region (or V and C regions) can be cotransfected with analogous 
vectors carrying a light chain V region (or V and C regions), allowing for the expression of noncovalently 
associated binding sites (or complete antibody molecules). 

In the examples which follow, a specific example of how to make a single chain binding site is 
disclosed, together with methods employed to assess its binding properties. Thereafter, a protein construct 
having two functional domains is disclosed. Lastly, there is disclosed a series of additional targeted proteins 
which exemplify the invention. 

I. EXAMPLE OF CDR EXCHANGE AND EXPRESSION 

The synthetic genes coding for murine Vh and Vl 26-10 shown in Figures 4A and 4B were designed 
from the known amino acid sequence of the protein with the aid of Compugene, a software program. These 
genes, although coding for the native amino acid sequences, also contain non-native and often unique 
restriction sites flanking nucleic acid sequences encoding CDRs to facilitate CDR replacement as noted 
above. 

Both the 3' and 5' ends of the large synthetic oligomers were designed to include 6-base restriction 
sites, present in the genes and the pUC vector. Furthermore, those restriction sites in the synthetic genes 
which were only suited for assembly but not for cloning in pUC were extended by "helper" cloning sites 
with matching sites in pUC. 

Cloning of the synthetic DNA and later assembly of the gene is facilitated by the spacing of unique 
restriction sites along the gene. This allows corrections and modifications by cassette mutagenesis at any 
location. Among them are alterations near the 5' or 3' ends of the gene as needed for the adaptation to 
different expression vectors. For example, a PstI site is positioned near the 5' end of the Vh gene. Synthetic 
linkers can be attached easily between this site and a restriction site in the expression plasmid. These 
genes were synthesized by assembling oligonucleotides as described above using a Biosearch Model 8600 
DNA Synthesizer. They were ligated to vector pUC8 for transformation of E. coli. 

Specific CDRs may be cleaved from the synthetic Vh gene by digestion with the following pairs of 
restriction endonucleases: HphI and BstXI for CDR1; XbaI and DraI for CDR2; and BanII and BanI for CDR3. 
After removal of one CDR, another CDR of desired specificity may be ligated directly into the restricted 
gene in its place if the 3' and 5' ends of the restricted gene and the new CDR contain complementary 
single stranded DNA sequences. 

In the present example, the three CDRs of each of murine Vh 26-10 and Vl 26-10 were replaced with 
the corresponding CDRs of glp-4. The nucleic acid sequences and corresponding amino acid sequences of 
the chimeric Vh and Vl genes encoding the FRs of 26-10 and CDRs of glp-4 are shown in Figures 4C and 
4D. The positions of the restriction endonuclease cleavage sites are noted with their standard abbreviations. 
CDR sequences are underlined as are the restriction endonucleases of choice useful for further CDR 
replacement. 

These genes were cloned into pUC8, a shuttle plasmid. To retain unique restriction sites after cloning, 
the Vh-like gene was spliced into the EcoRI and HindIII or BamHI sites of the plasmid. 

Direct expression of the genes may be achieved in E. coli. Alternatively, the gene may be preceded by 
a leader sequence and expressed in E. coli as a fusion product by splicing the fusion gene into the host 
gene whose expression is regulated by interaction of a repressor with the respective operator. The protein 
can be induced by starvation in minimal medium and by chemical inducers. The Vh-Vl biosynthetic 26-10 
gene has been expressed as such a fusion protein behind the trp and tac promoters. The gene translation 
product of interest may then be cleaved from the leader in the fusion protein by, e.g., cyanogen bromide 
degradation, tryptic digestion, mild acid cleavage, and/or digestion with factor Xa protease. Therefore, a 
shuttle plasmid containing a synthetic gene encoding a leader peptide having a site for mild acid cleavage, 
and into which has been spliced the synthetic BABS gene, was used for this purpose. In addition, synthetic 
DNA sequences encoding a signal peptide for secretion of the processed target protein into the periplasm 
of the host cell can also be incorporated into the plasmid. 

After harvesting the gene product and optionally releasing it from a fusion peptide, its activity as an 
antibody binding site and its specificity for glp-4 (lysozyme) epitope are assayed by established 
immunological techniques, e.g., affinity chromatography and radioimmunoassay. Correct folding of the protein 
to yield the proper three-dimensional conformation of the antibody binding site is prerequisite for its activity. 
This occurs spontaneously in a host such as a myeloma cell which naturally expresses immunoglobulin 
proteins. Alternatively, for bacterial expression, the protein forms inclusion bodies which, after harvesting, 
must be subjected to a specific sequence of solvent conditions (e.g., diluted 20 X from 8 M urea, 0.1 M 
Tris-HCl, pH 9, into 0.15 M NaCl, 0.01 M sodium phosphate, pH 7.4 (Hochman et al. (1976) Biochem. 
15:2706-2710)) to assume its correct conformation and hence its active form. 

Figures 4E and 4F show the DNA and amino acid sequence of chimeric Vh and Vl comprising human 
FRs from NEWM and murine CDRs from glp-4. The CDRs are underlined, as are restriction sites of choice 
for further CDR replacement or empirically determined refinement. 
These constructs also constitute master framework genes, this time constructed of human framework 
sequences. They may be used to construct BABS of any desired specificity by appropriate CDR 
replacement. 

Binding sites with other specificities have also been designed using the methodologies disclosed 
herein. Examples include those having FRs from the human NEWM antibody and CDRs from murine 26-10 
(Figure 9A), murine 26-10 FRs and G-loop CDRs (Figure 9B), FRs and CDRs from murine MOPC-315 
(Figure 9C), FRs and CDRs from an anti-human carcinoembryonic antigen monoclonal antibody (Figure 
9D), and FRs and CDRs 1, 2, and 3 from Vl, and FRs and CDR 1 and 3 from the Vh of the anti-CEA 
antibody, with CDR 2 from a consensus immunoglobulin gene (Figure 9E). 

II. Model Binding Site: 

The digoxin binding site of the IgG2a,κ monoclonal antibody 26-10 has been analyzed by Mudgett-Hunter 
and colleagues (unpublished). The 26-10 V region sequences were determined from both amino 
acid sequencing and DNA sequencing of 26-10 H and L chain mRNA transcripts (D. Panka, J.N. & M.N.M., 
unpublished data). The 26-10 antibody exhibits a high digoxin binding affinity [Ko = 5.4 X 10^[...] M^-1] and has 
a well-defined specificity profile, providing a baseline for comparison with the biosynthetic binding sites 
mimicking its structure. 

Protein Design : 

Crystallographically determined atomic coordinates for Fab fragments of 26-10 were obtained from the 
Brookhaven Data Bank. Inspection of the available three-dimensional structures of Fv regions within their 
parent Fab fragments indicated that the Euclidean distance between the C-terminus of the Vh domain and 
the N-terminus of the Vl domain is about 35 Å. Considering that the peptide unit length is approximately 3.8 
Å, a 15 residue linker was selected to bridge this gap. The linker was designed so as to exhibit little 
propensity for secondary structure and not to interfere with domain folding. Thus, the 15 residue sequence 
(Gly-Gly-Gly-Gly-Ser)3 was selected to connect the Vh carboxyl- and Vl amino-termini. 
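
The arithmetic behind the linker length can be checked with a short calculation. This is only a sketch using the two figures quoted above (35 Å and 3.8 Å per residue); treating the linker as a fully extended chain is a simplifying assumption made here, not a statement from the source.

```python
import math

SPAN_ANGSTROM = 35.0     # Vh C-terminus to Vl N-terminus distance, from the text
RESIDUE_ANGSTROM = 3.8   # approximate peptide unit length, from the text

# Fewest residues that could bridge the gap if the chain were fully extended
min_residues = math.ceil(SPAN_ANGSTROM / RESIDUE_ANGSTROM)

# The linker chosen in the text, in one-letter code
linker = "GGGGS" * 3
max_reach = len(linker) * RESIDUE_ANGSTROM

print(f"minimum residues, fully extended: {min_residues}")
print(f"chosen linker: {linker} ({len(linker)} residues)")
print(f"maximum reach of chosen linker: {max_reach:.1f} Å")
```

Under this assumption about ten residues would be the bare minimum, so a 15 residue linker leaves slack for the chain to follow a non-linear path between the two termini.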

Binding studies with single chain binding sites having less than or greater than 15 residues demonstrate 
the importance of the prerequisite distance which must separate Vh from Vl; for example, a (Gly4-Ser)1 
linker does not demonstrate binding activity, and those with (Gly4-Ser)5 linkers exhibit very low activity 
compared to those with (Gly4-Ser)3 linkers. 

Gene Synthesis : 

Design of the 744 base sequence for the synthetic binding site gene was derived from the Fv protein 
sequence of 26-10 by choosing codons frequently used in E. coli. The model of this representative 
synthetic gene is shown in Figure 8, discussed previously. Synthetic genes coding for the trp 
promoter-operator, the modified trp LE leader peptide (MLE), the sequence of which is shown in Figure 10A, and Vh 
were prepared largely as described previously. The gene coding for Vh was assembled from 46 chemically 
synthesized oligonucleotides, all 15 bases long, except for terminal fragments (13 to 19 bases) that included 
cohesive cloning ends. Between 8 and 15 overlapping oligonucleotides were enzymatically ligated into 
double stranded DNA, cut at restriction sites suitable for cloning (NarI, XbaI, SalI, SacII, SacI), purified by 
PAGE on 8% gels, and cloned in pUC which was modified to contain additional cloning sites in the 
polylinker. The cloned segments were assembled stepwise into the complete gene mimicking Vh by 
ligations in the pUC cloning vector. 
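
Deriving a gene from a protein sequence by codon choice can be illustrated with a small back-translation sketch. The codon table below is hypothetical: it lists one codon per residue, picked here for illustration, and is not the table used for the gene described above. Only the linker residues plus Asp and Pro are covered.

```python
# Hypothetical preferred-codon table (one codon per residue, illustration only).
PREFERRED_CODON = {
    "G": "GGT",  # glycine
    "S": "TCT",  # serine
    "D": "GAT",  # aspartic acid
    "P": "CCG",  # proline
}


def back_translate(protein: str) -> str:
    """Return a DNA sequence encoding `protein` using the table above."""
    return "".join(PREFERRED_CODON[residue] for residue in protein)


linker_protein = "GGGGS" * 3
linker_dna = back_translate(linker_protein)
print(linker_dna, len(linker_dna))

# A 744 base coding sequence corresponds to a whole number of codons
codons_in_gene = 744 // 3
print(codons_in_gene)
```

A practical design would also vary synonymous codons to place or avoid restriction sites, which a single-codon table like this one cannot do.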

The gene mimicking 26-10 Vl was assembled from 12 long synthetic polynucleotides ranging in size 
from 33 to 88 base pairs, prepared in automated DNA synthesizers (Model 6500, Biosearch, San Rafael, 
CA; Model 380A, Applied Biosystems, Foster City, CA). Five individual double stranded segments were 
made out of pairs of long synthetic oligonucleotides spanning six-base restriction sites in the gene (AatII, 
BstEII, KpnI, HindIII, BglII, and PstI). In one case, four long overlapping strands were combined and cloned. 
Gene fragments bounded by restriction sites for assembly that were absent from the pUC polylinker, such 
as AatII and BstEII, were flanked by EcoRI and BamHI ends to facilitate cloning. 

The linker between Vh and Vl, encoding (Gly-Gly-Gly-Gly-Ser)3, was cloned from two long synthetic 
oligonucleotides, 54 and 62 bases long, spanning SacI and AatII sites, the latter followed by an EcoRI 
cloning end. The complete single chain binding site gene was assembled from the Vh, Vl, and linker genes 
to produce a construct, corresponding to aspartyl-prolyl-VH-<linker>-VL, flanked by EcoRI and PstI 
restriction sites. 

The trp promoter-operator, starting from its SspI site, was assembled from 12 overlapping 15 base 
oligomers, and the MLE leader gene was assembled from 24 overlapping 15 base oligomers. These were 
cloned and assembled in pUC using the strategy of assembly sites flanked by cloning sites. The final 
expression plasmid was constructed in the pBR322 vector by a 3-part ligation using the sites SspI, EcoRI, 
and PstI (see Figure 10B). Intermediate DNA fragments and assembled genes were sequenced by the 
dideoxy method. 

Fusion Protein Expression : 

Single-chain protein was expressed as a fusion protein. The MLE leader gene (Fig. 10A) was derived 
from E. coli trp LE sequence and expressed under the control of a synthetic trp promoter and operator. E. 
coli strain JM83 was transformed with the expression plasmid and protein expression was induced in M9 
minimal medium by addition of indoleacrylic acid (10 µg/ml) at a cell density with A600 = 1. The high 
expression levels of the fusion protein resulted in its accumulation as insoluble protein granules, which were 
harvested from cell paste (Figure 11, Lane 1). 

Fusion Protein Cleavage : 

The MLE leader was removed from the binding site protein by acid cleavage of the Asp-Pro peptide 
bond engineered at the junction of the MLE and binding site sequences. The washed protein granules 
containing the fusion protein were cleaved in 6 M guanidine-HCl + 10% acetic acid, pH 2.5, incubated at 
37°C for 96 hrs. The reaction was stopped through precipitation by addition of a 10-fold excess of ethanol 
with overnight incubation at -20°C, followed by centrifugation and storage at -20°C until further purification 
(Figure 11, Lane 2). 

Protein Purification : 

The acid cleaved binding site was separated from remaining intact fused protein species by 
chromatography on DEAE® cellulose. The precipitate obtained from the cleavage mixture was redissolved in 6 M 
guanidine-HCl + 0.2 M Tris-HCl, pH 8.2, + 0.1 M 2-mercaptoethanol and dialyzed exhaustively against 6 
M urea + 2.5 mM Tris-HCl, pH 7.5, + 1 mM EDTA. 2-Mercaptoethanol was added to a final concentration 
of 0.1 M, the solution was incubated for 2 hrs at room temperature and loaded onto a 2.5 X 45 cm column 
of DEAE cellulose (Whatman DE 52), equilibrated with 6 M urea + 2.5 mM Tris-HCl + 1 mM EDTA, pH 
7.5. The intact fusion protein bound weakly to the DE 52 column such that its elution was retarded relative 
to that of the binding protein. The first protein fractions which eluted from the column after loading and 
washing with urea buffer contained BABS protein devoid of intact fusion protein. Later fractions 
contaminated with some fused protein were pooled, rechromatographed on DE 52, and recovered single chain 
binding protein combined with other purified protein into a single pool (Figure 11, Lane 3). 

Refolding : 

The 26-10 binding site mimic was refolded as follows: the DE 52 pool, disposed in 6 M urea + 2.5 mM 
Tris-HCl + 1 mM EDTA, was adjusted to pH 8 and reduced with 0.1 M 2-mercaptoethanol at 37°C for 90 
min. This was diluted at least 100-fold with 0.01 M sodium acetate, pH 5.5, to a concentration below 10 
µg/ml and dialyzed at 4°C for 2 days against acetate buffer. 

Affinity Chromatography : 

Purification of active binding protein by affinity chromatography at 4°C on a ouabain-amine-Sepharose® 
column was performed. The dilute solution of refolded protein was loaded directly onto a pair 
of tandem columns, each containing 3 ml of resin equilibrated with the 0.01 M acetate buffer, pH 5.5. The 
columns were washed individually with an excess of the acetate buffer, and then by sequential additions of 
5 ml each of 1 M NaCl, 20 mM ouabain, and 3 M potassium thiocyanate dissolved in the acetate buffer, 
interspersed with acetate buffer washes. Since digoxin binding activity was still present in the eluate, the 
eluate was pooled and concentrated 20-fold by ultrafiltration (PM 10 membrane, 200 ml concentrator; 
Amicon), reapplied to the affinity columns, and eluted as described. Fractions with significant absorbance at 
280 nm were pooled and dialyzed against PBSA or the above acetate buffer. The amounts of protein in the 
DE 52 and ouabain-Sepharose® pools were quantitated by amino acid analysis following dialysis against 
0.01 M acetate buffer. The results are shown below in Table 1. 

TABLE 1 

Estimated Yields of BABS Protein During Purification 

Step                       Wet wt.    mg protein       Cleavage yield (%),   Yield relative to 
                           per L                       prior step            fusion protein 

Cell paste                 12.0 g     1440.0 mg (a) 
Fusion protein granules    2.3 g      480.0 mg (b)     100.0%                100.0% 
Acid cleavage/DE 52 pool              144.0 mg         38.0 (d)              38.0 (d) 
Ouabain-Sepharose pool                18.1 mg          12.6 (c)              4.7 (d) 

(a) Determined by absorbance measurements 
(b) Determined by amino acid analysis 
(c) [...] ouabain-Sepharose® relative to that applied to the resin; values were determined by amino acid analysis 
(d) Percentage yield calculated on a molar basis 
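
The step yields can be recomputed from the protein amounts in Table 1. The sketch below uses only the milligram figures from the table; the molar mass ratio it derives is an inference made here from the stated molar-basis yield of 38.0%, not a value given in the source.

```python
# Protein recovered at each step, in mg, taken from Table 1
fusion_mg = 480.0
de52_mg = 144.0
ouabain_mg = 18.1

# Yield of the cleavage/DE 52 step on a mass basis
mass_yield_cleavage = 100 * de52_mg / fusion_mg

# Yield of the affinity step relative to the protein applied to the resin
affinity_step_yield = 100 * ouabain_mg / de52_mg

# If the tabulated 38.0% is a molar yield, the fusion protein would have to be
# heavier than the cleaved binding site by this factor (inferred, not stated)
implied_mass_ratio = 38.0 / mass_yield_cleavage

print(f"cleavage yield, mass basis: {mass_yield_cleavage:.1f}%")
print(f"affinity step yield: {affinity_step_yield:.1f}%")
print(f"implied fusion/binding-site mass ratio: {implied_mass_ratio:.2f}")
```

The difference between the 30% mass yield and the 38% molar yield is what one would expect when a leader peptide is removed, since each mole of product weighs less than each mole of starting fusion protein.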



Sequence Analysis of Gene and Protein: 

The gene was sequenced in both directions using the dideoxy method of Sanger, which 
confirmed the gene was correctly assembled. The protein sequence was also verified by protein 
sequencing [...] (residues 1-40), as well as two 
major CNBr fragments (residues 108-129 and 140-159), with a Model 470A [...] 
with a Model 120A on-line phenylthiohydantoin-amino acid analyzer [...]. 
Homogeneous binding protein fractionated by SDS-PAGE and eluted from gel strips with water [...] 
with a 20,000-fold excess of CNBr, in trifluoroacetic acid-acetonitrile [...] 
[...] SDS-PAGE and transferred electrophoretically onto an 
Immobilon membrane (Millipore, Bedford, MA), from which stained bands were cut out and sequenced. 



Specificity Determination: 



Specificities of anti-digoxin 26-10 Fab and the BABS were assessed by radioimmunoassay. Wells of 
[...] affinity-purified goat anti-murine Fab fragment (ICN ImmunoBiologicals, 
Lisle, IL) at 10 µg/ml in PBSA overnight at 4°C. After the plates were washed and blocked with 1% horse 
serum in PBSA, solutions (50 µl) containing 26-10 Fab or the BABS in either PBSA or 0.01 M sodium 
acetate at pH 5.5 were added to the wells and incubated 2-3 hrs at [...]. 
[...] digoxin analogue were calculated by dividing the concentration of each analogue [...] 
displacement [...] observed for 26-10 Fab, because less active 
[...] acetate, pH 5.5, [...] BABS in 0.01 M [...] 
to goat anti-murine Fab coating on the plate. This caused 
[...] concentrations, closer to the position of those [...] 
26-10 Fab, although maintaining the relative positions of curves for sFv obtained in acetate buffer alone. 
[...] 
TABLE 2 

26-10 Antibody   Normalizing 
Species          Glycoside       D      DG     DO     DOG    A-S    G      O 

Fab              Digoxin         1.0    1.2    0.9    1.0    1.3    9.6    15 
                 Digoxigenin     0.9    1.0    0.8    0.9    1.1    8.1    13 
BABS             Digoxin         1.0    7.3    2.0    2.6    5.9    62     150 
                 Digoxigenin     0.14   1.0    0.3    0.4    0.8    8.5    21 

D = Digoxin 
DG = Digoxigenin 
DO = Digitoxin 
DOG = Digitoxigenin 
A-S = Acetyl Strophanthidin 
G = Gitoxin 
O = Ouabain 
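
The two rows given for each species in Table 2 appear to differ only in the glycoside used as the reference. A minimal sketch of that renormalization follows, using the BABS values normalized to digoxin; the function name renormalize is a hypothetical helper introduced here.

```python
# BABS values normalized to digoxin (D = 1.0), taken from Table 2
babs_vs_digoxin = {
    "D": 1.0, "DG": 7.3, "DO": 2.0, "DOG": 2.6,
    "A-S": 5.9, "G": 62.0, "O": 150.0,
}


def renormalize(values: dict, reference: str) -> dict:
    """Divide every entry by the value of the chosen reference glycoside."""
    ref = values[reference]
    return {name: value / ref for name, value in values.items()}


babs_vs_digoxigenin = renormalize(babs_vs_digoxin, "DG")
for name, value in babs_vs_digoxigenin.items():
    print(f"{name:>4}: {value:.2f}")
```

Dividing by the digoxigenin entry (7.3) gives values close to the digoxigenin-normalized BABS row, which supports reading the two rows as the same data on different scales.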



Affinity Determination : 

Association constants were measured by equilibrium binding studies. In immunoprecipitation 
experiments, [...] digoxin (New England Nuclear, Billerica, MA) at a series of concentrations (10^[...] M to 
10^[...] M) were added to 100 µl of 26-10 Fab or the BABS at a fixed concentration. After 2-3 hrs of 
incubation at room temperature, the protein was precipitated by the addition of 100 µl goat antiserum to 
murine Fab fragment (ICN ImmunoBiologicals), 50 µl of the IgG fraction of rabbit anti-goat IgG (ICN 
ImmunoBiologicals), and 50 µl of a 10% suspension of protein A-Sepharose® (Sigma). Following 2 hrs at 
[...] (Vacuum Filtration 
Manifold, Millipore, Bedford, MA). Filter disks were then counted in 5 ml of scintillation fluid with a Model 
[...] Scintillation Analyzer (Packard, Sterling, VA). The association constants, Ko, were 
calculated from Scatchard analyses of the untransformed radioligand binding data using LIGAND, a 
non-[...] Sips plots and binding 
isotherms shown in Figure 13A for the BABS and 13B for the Fab. For binding isotherms, data are plotted 
as [...] digoxin bound versus the log of the unbound digoxin concentration, and the 
dissociation constant is estimated from the ligand concentration at 50% saturation. These binding data are 
also plotted in linear form as Sips plots (inset), having the same abscissa as the binding isotherm but with 
the ordinate representing log [r/(n - r)], defined below. The average intrinsic association constant (Ko) was 
calculated from the modified Sips equation (39), log [r/(n - r)] = a log C - a log Ko, where r equals moles of 
digoxin bound per mole of antibody at an unbound digoxin concentration equal to C; n is the number of 
moles of digoxin bound at saturation of the antibody binding site, and a is an index of heterogeneity which 
describes the distribution of association constants about the average intrinsic association constant Ko. Least 
squares [...] correlation coefficients for the lines obtained were 
0.96 for the BABS and 0.99 for 26-10 Fab. A summary of the calculated association constants is shown 
below in Table 3. 



TABLE 3 

Method of Data Analysis     Association Constant, Ko 

                            Ko (BABS), M^-1           Ko (Fab), M^-1 

Scatchard plot              (3.2 ± 0.9) X 10^[...]    (1.9 ± 0.2) X 10^8 
Sips plot                   2.6 X 10^[...]            1.8 X 10^8 
Binding isotherm            5.2 X 10^[...]            3.3 X 10^8 
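
For readers unfamiliar with Scatchard analysis, the sketch below shows the classical linear form, r/C = K·n - K·r, fitted by ordinary least squares. It does not reproduce the LIGAND program named in the text, and the data are synthetic and noise-free: K_TRUE, N_TRUE and the concentration series are arbitrary illustrative values, not measurements from the source.

```python
def scatchard_fit(free_conc, bound_per_mole):
    """Least-squares line through (r, r/C); returns (K, n).

    Scatchard form for one class of sites: r/C = K*n - K*r,
    so the slope is -K and the x-intercept is n.
    """
    xs = list(bound_per_mole)
    ys = [r / c for r, c in zip(bound_per_mole, free_conc)]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k_assoc = -slope
    n_sites = intercept / k_assoc
    return k_assoc, n_sites


# Synthetic binding data for a single class of sites (illustrative values only)
K_TRUE = 2.0e8   # association constant, M^-1
N_TRUE = 1.0     # binding sites per molecule
free = [1e-10, 3e-10, 1e-9, 3e-9, 1e-8, 3e-8]   # unbound ligand, M
bound = [N_TRUE * K_TRUE * c / (1 + K_TRUE * c) for c in free]

k_fit, n_fit = scatchard_fit(free, bound)
print(f"K = {k_fit:.3e} M^-1, n = {n_fit:.3f}")
print(f"ligand concentration at 50% saturation = {1 / k_fit:.2e} M")
```

For a single class of sites the ligand concentration at half saturation equals 1/K, which is the relationship the binding isotherm method relies on.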
III. Synthesis of a Multifunctional Protein 

A nucleic acid sequence encoding the single chain binding site described above was fused with a 
sequence encoding the FB fragment of protein A as a leader to function as a second active region. As a 
spacer, the native amino acids comprising the last 11 amino acids of the FB fragment bonded to an 
Asp-Pro dilute acid cleavage site were employed. The FB binding domain of the FB consists of the immediately 
preceding 43 amino acids which assume a helical configuration (see Fig. 2B). 

The gene fragments are synthesized using a Biosearch DNA Model 8600 Synthesizer as described 
above. Synthetic oligonucleotides are cloned according to established protocol described above using the 
pUC8 vector transfected into E. coli. The completed fused gene set forth in Figure 6A is then expressed in 
E. coli. 

After sonication, inclusion bodies were collected by centrifugation, and dissolved in 6 M guanidine 
hydrochloride (GuHCl), 0.2 M Tris, and 0.1 M 2-mercaptoethanol (BME), pH 8.2. The protein was denatured 
and reduced in the solvent overnight at room temperature. Size exclusion chromatography was used to 
purify fusion protein from the inclusion bodies. A Sepharose 4B column (1.5 X 80 cm) was run in a solvent 
of 6 M GuHCl and 0.01 M NaOAc, pH 4.75. The protein solution was applied to the column at room 
temperature in 0.5-1.0 ml amounts. Fractions were collected and precipitated with cold ethanol. These were 
run on SDS gels, and fractions rich in the recombinant protein (approximately 34,000 D) were pooled. This 
offers a simple first step for cleaning up inclusion body preparations without suffering significant proteolytic 
degradation. 

For refolding, the protein was dialyzed against 100 ml of the same GuHCl-Tris-BME solution, and 
dialysate was diluted 11-fold over two days to 0.55 M GuHCl, 0.01 M Tris, and 0.01 M BME. The dialysis 
sacks were then transferred to 0.01 M NaCl, and the protein was dialyzed exhaustively before being 
assayed by RIAs for binding of 125I-labelled digoxin. The refolding procedure can be simplified by making a 
rapid dilution with water to reduce the GuHCl concentration to 1.1 M, and then dialyzing against phosphate 
buffered saline (0.15 M NaCl, 0.05 M potassium phosphate, pH 7, containing 0.03% NaN3), so that it is free 
of any GuHCl within 12 hours. Product of both types of preparation showed binding activity, as indicated in 
Figure 7A. 
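
The denaturant concentrations quoted for the two refolding routes follow from simple dilution arithmetic. The sketch below checks them; the helper name diluted is hypothetical, and only the GuHCl and BME figures are recomputed.

```python
def diluted(concentration_molar: float, fold: float) -> float:
    """Concentration after a `fold`-fold dilution."""
    return concentration_molar / fold


START_GUHCL = 6.0   # M, starting guanidine hydrochloride
START_BME = 0.1     # M, starting 2-mercaptoethanol

# Slow route: 11-fold dilution of the dialysate over two days
slow_guhcl = diluted(START_GUHCL, 11)
slow_bme = diluted(START_BME, 11)

# Fast route: dilution factor needed to reach 1.1 M GuHCl in one step
fast_fold = START_GUHCL / 1.1

print(f"GuHCl after 11-fold dilution: {slow_guhcl:.2f} M")
print(f"BME after 11-fold dilution: {slow_bme:.3f} M")
print(f"fold dilution to reach 1.1 M GuHCl: {fast_fold:.2f}")
```

An 11-fold dilution of 6 M GuHCl gives about 0.55 M, matching the text, while the rapid route corresponds to roughly a 5.5-fold dilution with water.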

Demonstration of Bifunctionality : 

This protein with an FB leader and a fused BABS is bifunctional; the BABS can bind the antigen and 
the FB can bind the Fc regions of immunoglobulins. To demonstrate this dual and simultaneous activity, 
several radioimmunoassays were performed. 

Properties of the binding site were probed by a modification of an assay developed by Mudgett-Hunter 
et al. (J. Immunol. (1982) 129:1165-1172; Molec. Immunol. (1985) 22:477-488), so that it could be run on 
microtiter plates as a solid phase sandwich assay. Binding data were collected using goat anti-murine Fab 
antisera (gAmFab) as the primary antibody that initially coats the wells of the plate. These are polyclonal 
antisera which recognize epitopes that appear to reside mostly on framework regions. The samples of 
interest are next added to the coated wells and incubated with the gAmFab, which binds species that 
exhibit appropriate antigenic sites. After washing away unbound protein, the wells are exposed to 
125I-labelled (radioiodinated) digoxin conjugates, either as 125I-dig-BSA or 125I-dig-lysine. 

The data are plotted in Figure 7A, which shows the results of a dilution curve experiment in which the 
parent 26-10 antibody was included as a control. The sites were probed with 125I-dig-BSA as described 
above, with a series of dilutions prepared from initial stock solutions, including both the slowly refolded (1) 
and fast diluted/quickly refolded (2) single chain proteins. The parallelism between all three dilution curves 
indicates that gAmFab binding regions on the BABS molecule are essentially the same as on the Fv of 
authentic 26-10 antibody, i.e., the surface epitopes appear to be the same for both proteins. 

The sensitivity of these assays is such that binding affinity of the Fv for digoxin must be at least 10^[...]. 
Experimental data on digoxin binding yielded binding constants in the range of 10^[...] to 10^[...] M^-1. The parent 
26-10 antibody has an affinity of 5.4 X 10^[...] M^-1. Inhibition assays also indicate the binding of 125I-dig-lysine, 
which can be inhibited by unlabelled digoxin, digoxigenin, digitoxin, digitoxigenin, gitoxin, acetyl 
strophanthidin, and ouabain in a way largely parallel to the parent 26-10 Fab. This indicates that the specificity of the 
biosynthetic protein is substantially identical to the original monoclonal. 

In a second type of assay, Digoxin-BSA is used to coat microtiter plates. Renatured BABS (FB-BABS) 
is added to the coated plates so that only molecules that have a competent binding site can stick to the 
plate. 125I-labelled rabbit IgG (radioligand) is mixed with bound FB-BABS on the plates. Bound radioactivity 
reflects the interaction of IgG with the FB domain of the BABS, and the specificity of this binding is 
demonstrated by its inhibition with increasing amounts of FB, protein A, rabbit IgG, IgG2a, and IgG1, as 
shown in Figure 7B. 

The following species were tested in order to demonstrate authentic binding: unlabelled rabbit IgG and 
IgG2a monoclonal antibody (which bind competitively to the FB domain of the BABS); and protein A and 
FB (which bind competitively to the radioligand). As shown in Figure 7B, these species are found to 
completely inhibit radioligand binding, as expected. A monoclonal antibody of the IgG1 subclass binds 
poorly to the FB, as expected, inhibiting only about 34% of the radioligand from binding. These data 
indicate that the BABS domain and the FB domain have independent activity. 

IV. OTHER CONSTRUCTS 

Other BABS-containing proteins constructed according to the invention, expressible in E. coli and other 
host cells as described above, are set forth in the drawing. These proteins may be bifunctional or 
multifunctional. Each construct includes a single chain BABS linked via a spacer sequence to an effector 
molecule comprising amino acids encoding a biologically active effector protein such as an enzyme, 
receptor, toxin, or growth factor. Some examples of such constructs shown in the drawing include proteins 
comprising epidermal growth factor (EGF) (Figure 15A), streptavidin (Figure 15B), tumor necrosis factor 
(TNF) (Figure 15C), calmodulin (Figure 15D), the beta chain of platelet derived growth factor (B-PDGF) (15E), 
ricin A (15F), interleukin 2 (15G), and FB dimer (15H). Each is used as a trailer and is connected to a 
preselected BABS via a spacer (Gly-Ser-Gly) encoded by DNA defining a BamHI restriction site. Additional 
amino acids may be added to the spacer for empirical refinement of the construct if necessary by opening 
up the BamHI site and inserting an oligonucleotide of a desired length having BamHI sticky ends. Each 
gene also terminates with a PstI site to facilitate insertion into a suitable expression vector. 
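
The link between the Gly-Ser-Gly spacer and the BamHI site can be made concrete: the BamHI recognition sequence GGATCC reads in frame as GGA TCC, i.e. Gly-Ser. The sketch below is illustrative only; the third codon (GGT), the inserted core sequence, and the helper names are assumptions introduced here, not sequences from the source.

```python
BAMHI = "GGATCC"  # BamHI recognition sequence, cut as G^GATCC

# Only the codons needed for this illustration
CODON_TO_RESIDUE = {"GGA": "G", "GGT": "G", "GGC": "G", "GGG": "G", "TCC": "S"}


def translate(dna: str) -> str:
    """Translate an in-frame sequence using the small table above."""
    return "".join(CODON_TO_RESIDUE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def insert_into_bamhi(construct: str, insert_core: str) -> str:
    """Ligate an insert carrying GATC sticky ends into a unique BamHI site.

    Cutting G^GATCC and ligating gives ...G + GATC + core + GATCC...
    """
    if construct.count(BAMHI) != 1:
        raise ValueError("construct must contain exactly one BamHI site")
    if (4 + len(insert_core)) % 3 != 0:
        raise ValueError("insert would shift the reading frame")
    cut = construct.index(BAMHI) + 1  # position just after the first G
    return construct[:cut] + "GATC" + insert_core + construct[cut:]


spacer_dna = BAMHI + "GGT"          # third codon chosen arbitrarily here
print(translate(spacer_dna))        # Gly-Ser-Gly in one-letter code

extended = insert_into_bamhi(spacer_dna, "CGGTG")  # hypothetical 5-base core
print(extended, translate(extended))
```

In this toy case the insert regenerates a BamHI site at each junction and doubles the spacer to Gly-Ser-Gly-Gly-Ser-Gly; the frame check reflects why the inserted oligonucleotide length has to be chosen with care.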

The BABS of the EGF and PDGF constructs may be, for example, specific for fibrin so that the EGF or 
PDGF is delivered to the site of a wound. The BABS for TNF and ricin A may be specific to a tumor 
antigen, e.g., CEA, to produce a construct useful in cancer therapy. The calmodulin construct binds 
radioactive ions and other metal ions. Its BABS may be specific, for example, to fibrin or a tumor antigen, 
so that it can be used as an imaging agent to locate a thrombus or tumor. The streptavidin construct binds 
with biotin with very high affinity. The biotin may be labeled with a remotely detectable ion for imaging 
purposes. Alternatively, the biotin may be immobilized on an affinity matrix or solid support. The 
BABS-streptavidin protein could then be bound to the matrix or support for affinity chromatography or solid phase 
immunoassay. The interleukin-2 construct could be linked, for example, to a BABS specific for a T-cell 
surface antigen. The FB-FB dimer binds to Fc, and could be used with a BABS in an immunoassay or 
affinity purification procedure linked to a solid phase through immobilized immunoglobulin. 

Figure 14 exemplifies a multifunctional protein having an effector segment as a leader. It comprises an 
FB-FB dimer linked through its C-terminal via an Asp-Pro dipeptide to a BABS of choice. It functions in a 
way very similar to the construct of Fig. 15H. The dimer binds avidly to the Fc portion of immunoglobulin. 
This type of construct can accordingly also be used in affinity chromatography, solid phase immunoassay, 
and in therapeutic contexts where coupling of immunoglobulins to another epitope is desired. 

In view of the foregoing, it should be apparent that the invention is unlimited with respect to the specific 
types of BABS and effector proteins to be linked. Accordingly, other embodiments are within the following 
claims. 

Claims 

1. A single chain multi-functional biosynthetic protein expressed from a single gene derived by 
recombinant DNA techniques, said protein comprising: 

a biosynthetic antibody binding site capable of binding to a preselected antigenic determinant and 
comprising an amino acid sequence homologous with the sequence of a variable region of an 
immunoglobulin molecule capable of binding said preselected antigenic determinant, 

a first biofunctional domain comprising a polypeptide selected from the group consisting of effector 
proteins having a conformation suitable for biological activity in mammals, amino acid sequences 
capable of sequestering an ion, and amino acid sequences capable of selective binding to a solid 
support, and 

a first polypeptide linker disposed between said binding site and said first biofunctional domain, 

wherein said polypeptide linker comprises plural, hydrophilic, peptide-bonded amino acids and which 
defines a polypeptide which connects the C-terminal end of said binding site and the N-terminal end of 
said first biofunctional domain or the N-terminal end of said binding site and the C-terminal end of said 
first biofunctional domain, whereupon said binding protein assumes a conformation suitable for binding 
and said first biofunctional domain assumes a conformation suitable for biological activity, sequestering 
an ion, or selectively binding a solid support. 

2. The protein of claim 1 wherein said binding site comprises at least two binding domains peptide
bonded by a second polypeptide linker disposed between said domains, wherein said second
polypeptide linker comprises plural, hydrophilic, peptide-bonded amino acids and which defines a
polypeptide of a length sufficient to span the distance between the C-terminal end of one of said
binding domains and the N-terminal end of the other of said binding domains when said binding protein
assumes a conformation suitable for binding when disposed in aqueous solution.

3. The protein of claim 2 wherein the amino acid sequence of each of said binding domains comprises a
set of CDRs interposed between a set of FRs, each of which is respectively homologous with CDRs
and FRs from a said variable region of an immunoglobulin molecule capable of binding said
preselected antigenic determinant.

4. The protein of claim 2 wherein at least one of said binding domains comprises a said set of CDRs 
homologous with the CDRs in a first immunoglobulin and a set of FRs homologous with the FRs in a 
second, distinct immunoglobulin. 

5. The protein of claim 1 or 2 wherein said effector protein is an enzyme, toxin, receptor, binding site,
biosynthetic antibody binding site, growth factor, cell-differentiation factor, lymphokine, cytokine,
hormone, a remotely detectable moiety, or anti-metabolite.

6. The protein of claim 1 or 2 wherein said sequence capable of sequestering an ion is calmodulin,
metallothionein, a fragment thereof, or an amino acid sequence rich in at least one of glutamic acid,
aspartic acid, lysine, and arginine.

7. The protein of claim 1 or 2 wherein said polypeptide sequence capable of selective binding to a solid
support is a positively or negatively charged amino acid sequence, a cysteine-containing amino acid
sequence, streptavidin, or a fragment of Staphylococcus protein A.

8. The protein of claim 2 comprising a plurality of biosynthetic antibody binding sites, each binding site 
comprising two said binding domains connected by a polypeptide linker. 

9. A DNA encoding the protein of claim 1.

10. A host cell harboring and capable of expressing the DNA of claim 9. 

11. A single polypeptide chain comprising: a pair of polypeptide domains together defining a site for
binding a preselected antigen, and being joined through the C-terminus of one to the N-terminus of the
other by a polypeptide linker, wherein the amino acid sequence of each of said polypeptide domains
mimics an immunoglobulin variable region, and at least one said domain comprises:

a set of CDR amino acid sequences together defining a recognition site for said preselected
antigen, wherein said CDR sequences are non-human sequences,

a set of FR amino acid sequences linked to said set of CDR sequences, wherein said FR amino 
acid sequences are homologous to sequences obtained from a human immunoglobulin, and 

said linked sets of CDR and FR amino acid sequences together defining a chimeric binding domain
which, when disposed in aqueous solution, assumes a tertiary structure suitable for immunological
binding with said preselected antigen.

12. A single polypeptide chain comprising: a pair of polypeptide domains defining a site for binding a
preselected antigen joined by a polypeptide linker spanning the distance between the C-terminal of one
to the N-terminal of the other, wherein the amino acid sequence of at least one of said polypeptide
domains comprises a recombinant variable region comprising:

a set of CDR amino acid sequences together defining a recognition site for said preselected
antigen, wherein said CDR sequences are homologous to sequences obtained from a first
immunoglobulin,

a set of FR amino acid sequences linked to said set of CDR sequences, wherein said FR amino
acid sequences are homologous to sequences obtained from a second immunoglobulin, and

said linked sets of CDR and FR amino acid sequences together defining a chimeric single chain
variable region binding polypeptide which, when disposed in aqueous solution, assumes a tertiary
structure suitable for immunological binding with said preselected antigen.

13. The polypeptide chain of claim 12 wherein 

the FR sequences are homologous with sequences obtained from a human immunoglobulin and 
the CDR sequences are homologous with sequences obtained from a murine immunoglobulin. 

14. The polypeptide chain of claim 11 or 12 wherein said pair of polypeptide domains are peptide bonded
to a biologically active domain via a second polypeptide linker disposed between said binding
polypeptide and said biologically active domain, wherein said second polypeptide linker comprises
plural, hydrophilic, peptide-bonded amino acids, which together assume an unstructured polypeptide
configuration in aqueous solution, and connect the C-terminal end of said binding site and the
N-terminal end of said biologically active domain or the N-terminal end of said binding site and the
C-terminal end of said biologically active domain, whereupon said binding polypeptide assumes a
conformation suitable for binding and said biologically active domain assumes a conformation suitable
for biological activity, sequestering an ion, or selectively binding a solid support.

Patentansprüche

1. Einzelkettiges multifunktionelles biosynthetisches Protein, das durch ein durch rekombinante
DNA-Techniken erhaltenes Einzelgen exprimiert wird, wobei das Protein aufweist:

eine biosynthetische Antikörperbindungsstelle, die an eine vorgewählte antigene Determinante binden
kann und eine Aminosäuresequenz aufweist, die homolog zu der Sequenz einer variablen Region eines
Immunglobulinmoleküls ist, das die vorgewählte antigene Determinante binden kann,

eine erste biofunktionelle Domäne, die ein Polypeptid aufweist, das unter Effektorproteinen mit einer für
biologische Aktivität in Säugern geeigneten Konformation, Aminosäuresequenzen, die ein Ion maskieren
können, und Aminosäuresequenzen, die selektiv an einen festen Träger binden können, ausgewählt
ist und

einen ersten Polypeptidlinker, der zwischen der Bindungsstelle und der ersten biofunktionellen Domäne
angeordnet ist, wobei der Polypeptidlinker mehrere hydrophile, peptidgebundene Aminosäuren aufweist
und ein Polypeptid definiert, welches das C-terminale Ende der Bindungsstelle und das N-terminale
Ende der ersten biofunktionellen Domäne oder das N-terminale Ende der Bindungsstelle und das
C-terminale Ende der ersten biofunktionellen Domäne verbindet, wodurch das Bindungsprotein eine zum
Binden geeignete Konformation annimmt und die erste biofunktionelle Domäne eine für die biologische
Aktivität, Maskierung eines Ions oder selektive Bindung eines festen Trägers geeignete Konformation
annimmt.

2. Protein nach Anspruch 1, wobei die Bindungsstelle mindestens zwei Bindungsdomänen aufweist, die
durch einen zweiten Polypeptidlinker peptidgebunden sind, der zwischen den Domänen angeordnet ist,
wobei der zweite Polypeptidlinker mehrere hydrophile peptidgebundene Aminosäuren aufweist und ein
Polypeptid mit einer ausreichenden Länge zur Überbrückung des Abstandes zwischen dem C-terminalen
Ende einer der Bindungsdomänen und dem N-terminalen Ende der anderen der Bindungsdomänen
definiert, wenn das Bindungsprotein eine zur Bindung geeignete Konformation einnimmt,
wenn es in wäßrige Lösung gebracht wird.

3. Protein nach Anspruch 2, wobei die Aminosäuresequenz jeder der Bindungsdomänen einen Satz von
CDRs aufweist, die zwischen einem Satz von FRs angeordnet sind, welche jeweils entsprechend
homolog zu CDRs und FRs einer variablen Region eines Immunglobulinmoleküls sind, das die
vorgewählte antigene Determinante binden kann.

4. Protein nach Anspruch 2, wobei mindestens eine der Bindungsdomänen einen Satz von CDRs, die zu
den CDRs eines ersten Immunglobulins homolog sind, und einen Satz von FRs, die zu den FRs eines
zweiten, verschiedenen Immunglobulins homolog sind, aufweist.
5. Protein nach Anspruch 1 oder 2, wobei es sich bei dem Effektorprotein um ein Enzym, ein Toxin, einen
Rezeptor, eine Bindungsstelle, eine biosynthetische Antikörperbindungsstelle, einen Wachstumsfaktor,
einen Zelldifferenzierungsfaktor, ein Lymphokin, ein Cytokin, ein Hormon, einen indirekt nachweisbaren
Rest oder einen Antimetaboliten handelt.

6. Protein nach Anspruch 1 oder 2, wobei es sich bei der ein Ion maskierenden Sequenz um Calmodulin,
Metallothionein, ein Fragment davon oder um eine Aminosäuresequenz, die zumindest reich an einer
der Aminosäuren Glutaminsäure, Asparaginsäure, Lysin und Arginin ist, handelt.

7. Protein nach Anspruch 1 oder 2, wobei es sich bei der selektiv an einen festen Träger bindenden
Polypeptidsequenz um eine positiv oder negativ geladene Aminosäuresequenz, eine cysteinhaltige
Aminosäuresequenz, Streptavidin oder ein Fragment von Protein A aus Staphylococcus handelt.

8. Protein nach Anspruch 2, das eine Vielzahl von biosynthetischen Antikörperbindungsstellen aufweist,
wobei jede Bindungsstelle zwei der durch einen Polypeptidlinker verbundenen Bindungsdomänen
aufweist.

9. DNA, die das Protein nach Anspruch 1 codiert. 

10. Wirtszelle, welche die DNA nach Anspruch 9 enthält und exprimieren kann.

11. Polypeptideinzelkette, die aufweist:

ein Paar Polypeptiddomänen, die zusammen eine Bindungsstelle für ein vorgewähltes Antigen definieren
und über einen Polypeptidlinker mit dem C-Terminus der einen an den N-Terminus der anderen
gebunden sind, wobei die Aminosäuresequenz jeder der Polypeptiddomänen eine variable Region
eines Immunglobulins nachahmt und mindestens eine der Domänen aufweist:

einen Satz CDR-Aminosäuresequenzen, die zusammen eine Erkennungsstelle für das vorgewählte
Antigen definieren, wobei es sich bei den CDR-Sequenzen um nichtmenschliche Sequenzen handelt,

einen Satz FR-Aminosäuresequenzen, die an den Satz CDR-Sequenzen gebunden sind, wobei die
FR-Aminosäuresequenzen zu aus einem menschlichen Immunglobulin erhaltenen Sequenzen homolog
sind,

wobei die gebundenen Sätze CDR- und FR-Aminosäuresequenzen zusammen eine chimäre
Bindungsdomäne definieren, die in wäßriger Lösung eine zur immunologischen Bindung mit dem vorgewählten
Antigen geeignete Tertiärstruktur annimmt.

12. Polypeptideinzelkette, die aufweist:

ein Paar Polypeptiddomänen, die eine Bindungsstelle für ein vorgewähltes Antigen definieren und
durch einen Polypeptidlinker verbunden sind, der den Abstand zwischen dem C-Terminus der einen
und dem N-Terminus der anderen überbrückt, wobei die Aminosäuresequenz mindestens einer der
Polypeptiddomänen eine rekombinierte variable Region enthält, die aufweist:

einen Satz CDR-Aminosäuresequenzen, die zusammen eine Erkennungsstelle für das vorgewählte
Antigen definieren, wobei die CDR-Sequenzen zu aus einem ersten Immunglobulin erhaltenen
Sequenzen homolog sind,

einen Satz FR-Aminosäuresequenzen, die an den Satz CDR-Sequenzen gebunden sind, wobei die
FR-Aminosäuresequenzen zu aus einem zweiten Immunglobulin erhaltenen Sequenzen homolog sind,

wobei die gebundenen Sätze CDR- und FR-Aminosäuresequenzen zusammen ein chimäres
Einzelkettenbindungspolypeptid mit variabler Region definieren, das in wäßriger Lösung eine zur
immunologischen Bindung mit dem vorgewählten Antigen geeignete Tertiärstruktur annimmt.

13. Polypeptidkette nach Anspruch 12, wobei die FR-Sequenzen zu aus einem menschlichen
Immunglobulin erhaltenen Sequenzen homolog sind und die CDR-Sequenzen zu aus einem Mäuse-Immunglobulin
erhaltenen Sequenzen homolog sind.

14. Polypeptidkette nach Anspruch 11 oder 12, wobei das Paar Polypeptiddomänen an eine biologisch
aktive Domäne peptidgebunden ist, und zwar über einen zweiten Polypeptidlinker, der zwischen dem
Bindungspolypeptid und der biologisch aktiven Domäne angeordnet ist, wobei der zweite
Polypeptidlinker mehrere hydrophile peptidgebundene Aminosäuren aufweist, die zusammen in wäßriger Lösung
eine unstrukturierte Polypeptidkonfiguration annehmen, und das C-terminale Ende der Bindungsstelle
und das N-terminale Ende der biologisch aktiven Domäne oder das N-terminale Ende der
Bindungsstelle und das C-terminale Ende der biologisch aktiven Domäne verbindet, wodurch das
Bindungspolypeptid eine zur Bindung geeignete Konformation annimmt und die biologisch aktive Domäne eine für die
biologische Aktivität, zur Maskierung eines Ions oder zur selektiven Bindung eines festen Trägers
geeignete Konformation annimmt.

Revendications

1. Une protéine biosynthétique multifonctionnelle à chaîne unique, exprimée à partir d'un gène unique
dérivé par des techniques d'ADN recombinant, ladite protéine comprenant:

un site de liaison d'un anticorps biosynthétique capable de se lier à un déterminant antigénique
présélectionné et comprenant une séquence d'acides aminés homologue à la séquence d'une région
variable d'une molécule d'immunoglobuline capable de lier ledit déterminant antigénique
présélectionné,

un premier domaine biofonctionnel comprenant un polypeptide sélectionné dans le groupe
consistant en des protéines effectrices ayant une conformation appropriée pour une activité biologique chez
des mammifères, des séquences d'acides aminés capables de séquestrer un ion et des séquences
d'acides aminés capables de se lier sélectivement à un support solide, et

un premier polypeptide de jonction disposé entre ledit site de liaison et ledit premier domaine
biofonctionnel, où ledit polypeptide de jonction comprend des acides aminés multiples, hydrophiles,
reliés en un peptide et qui définit un polypeptide qui relie l'extrémité C-terminale dudit site de liaison et
l'extrémité N-terminale dudit premier domaine biofonctionnel ou l'extrémité N-terminale dudit site de
liaison et l'extrémité C-terminale dudit premier domaine biofonctionnel, après quoi ladite protéine de
liaison adopte une conformation appropriée pour la liaison et ledit premier domaine biofonctionnel
adopte une conformation appropriée pour une activité biologique, la séquestration d'un ion ou la liaison
sélective à un support solide.

2. La protéine de la revendication 1, dans laquelle ledit site de liaison comprend au moins deux domaines
de liaison reliés en un peptide par un second polypeptide de jonction disposé entre lesdits domaines,
où ledit second polypeptide de jonction comprend des acides aminés multiples, hydrophiles, reliés en
un peptide et qui définit un polypeptide d'une longueur suffisante pour franchir la distance séparant
l'extrémité C-terminale de l'un desdits domaines de liaison de l'extrémité N-terminale de l'autre desdits
domaines de liaison, quand ladite protéine de liaison adopte une conformation appropriée pour la
liaison, quand elle est placée en solution aqueuse.

3. La protéine de la revendication 2, dans laquelle la séquence d'acides aminés de chacun desdits
domaines de liaison comprend un jeu de CDR interposé entre un jeu de FR, chacun d'entre eux étant
respectivement homologue aux CDR et FR de ladite région variable d'une molécule d'immunoglobuline
capable de lier ledit déterminant antigénique présélectionné.

4. La protéine de la revendication 2, dans laquelle au moins un desdits domaines de liaison comprend un
jeu de CDR en question, homologue aux CDR de la première immunoglobuline et un jeu de FR
homologue aux FR d'une seconde immunoglobuline distincte.

5. La protéine de la revendication 1 ou 2, dans laquelle ladite protéine effectrice est une enzyme, une
toxine, un récepteur, un site de liaison, un site de liaison d'un anticorps biosynthétique, un facteur de
croissance, un facteur de différentiation cellulaire, une lymphokine, une cytokine, une hormone, un
fragment détectable à distance ou un anti-métabolite.

6. La protéine de la revendication 1 ou 2, dans laquelle ladite séquence capable de séquestrer un ion est
la calmoduline, la métallothionéine, un fragment de celles-ci ou une séquence d'acides aminés riche en
au moins l'un d'entre les acide glutamique, acide aspartique, lysine et arginine.

7. La protéine de la revendication 1 ou 2, dans laquelle ladite séquence polypeptidique capable de se lier
sélectivement à un support solide est une séquence d'acides aminés chargée positivement ou
négativement, une séquence d'acides aminés contenant de la cystéine, la streptavidine ou un fragment
de la protéine A de Staphylococcus.
8. La protéine de la revendication 2 comprenant une pluralité de sites de liaison d'un anticorps
biosynthétiques, chaque site de liaison comprenant deux desdits domaines de liaison reliés par un
polypeptide de jonction.

9. Un ADN codant pour la protéine de la revendication 1.

10. Une cellule hôte renfermant et étant capable d'exprimer l'ADN de la revendication 9.

11. Une chaîne polypeptidique unique comprenant:

une paire de domaines polypeptidiques définissant ensemble un site pour la liaison à un antigène
présélectionné et étant joints par l'intermédiaire du C-terminal de l'un et du N-terminal de l'autre par
un polypeptide de jonction, où la séquence d'acides aminés de chacun desdits domaines
polypeptidiques mime une région variable d'une immunoglobuline et au moins un desdits domaines comprend:

un jeu de séquences d'acides aminés CDR définissant ensemble un site de reconnaissance pour
ledit antigène présélectionné, dans lequel lesdites séquences CDR sont des séquences non-humaines,

un jeu de séquences d'acides aminés FR reliées audit jeu de séquences CDR, dans lequel lesdites
séquences d'acides aminés FR sont homologues aux séquences obtenues à partir d'une
immunoglobuline humaine, et

lesdits jeux reliés de séquences d'acides aminés CDR et FR définissant ensemble un domaine de
liaison chimérique qui, quand il est placé en solution aqueuse, adopte une structure tertiaire appropriée
pour une liaison immunologique avec ledit antigène présélectionné.

12. Une chaîne polypeptidique unique comprenant:

une paire de domaines polypeptidiques définissant un site pour la liaison à un antigène
présélectionné, reliés par un polypeptide de jonction franchissant la distance séparant le C-terminal de l'un du
N-terminal de l'autre, dans laquelle la séquence d'acides aminés de au moins un desdits domaines
polypeptidiques comprend une région variable recombinante comprenant:

un jeu de séquences d'acides aminés CDR définissant ensemble un site de reconnaissance pour
ledit antigène présélectionné, dans lequel lesdites séquences CDR sont homologues aux séquences
obtenues à partir d'une première immunoglobuline,

un jeu de séquences d'acides aminés FR, relié audit jeu de séquences CDR, dans lequel lesdites
séquences d'acides aminés FR sont homologues aux séquences obtenues à partir d'une seconde
immunoglobuline, et

lesdits jeux reliés de séquences d'acides aminés CDR et FR définissent ensemble un polypeptide
de liaison d'une région variable, à chaîne unique, chimérique, qui, quand il est placé en solution
aqueuse, adopte une structure tertiaire appropriée pour une liaison immunologique avec ledit antigène
présélectionné.

13. La chaîne polypeptidique de la revendication 12, dans laquelle les séquences FR sont homologues aux
séquences obtenues à partir d'une immunoglobuline humaine et les séquences CDR sont homologues
aux séquences obtenues à partir d'une immunoglobuline murine.

14. La chaîne polypeptidique de la revendication 11 ou 12, dans laquelle ladite paire de domaines
polypeptidiques est un peptide relié à un domaine biologiquement actif, par l'intermédiaire d'un second
polypeptide de jonction disposé entre ledit polypeptide de liaison et ledit domaine biologiquement actif,
dans laquelle ledit second polypeptide de jonction comprend des acides aminés multiples, hydrophiles,
reliés en un peptide, qui, ensemble, adoptent une configuration polypeptidique non-structurée en
solution aqueuse et relient l'extrémité C-terminale dudit site de liaison et l'extrémité N-terminale dudit
domaine biologiquement actif ou l'extrémité N-terminale dudit site de liaison et l'extrémité C-terminale
dudit domaine biologiquement actif, après quoi ledit polypeptide de liaison adopte une conformation
appropriée pour la liaison et ledit domaine biologiquement actif adopte une conformation appropriée
pour une activité biologique, la séquestration d'un ion ou la liaison sélective à un support solide.
10 20 30 40 50 60 70

CAATTCCIACTTCAACTCCACCXCTCTCGTCCTCAATTCCTTAAACCTCCCCCCTCTCTCCCCATCTCCT 
CluPhtCluVa:clnLtuClnGla5«rGlyProCluL«uValLysProClyAIaScrValAr|HttStrC 

AauII Bbvl Avail Ahtll Hhal 

Eeotr Fiiu«HI Sau96X Baal HlctFI 

TaqI fstX EeoBII HatlNlalXI 

HaaXX FapX 
Hhal 
HlnPX 
Nar X 
NlalV 
SerFI 
Aoyl 

80 90 100 110 120 130 140

CCAAATCC7CTCCCTACATTTTCACCCACTTCTACATCAATTCCCTTCCCCACTCTCATCCTAAGTCTCT 
yalyaSTStrClyTyr UtPheThr AapFhcTyrMetAsn TroValArKClnSTHlaCl vLvaSTLe 
Raal HphI Nlalll BstXI Nlalll Xba 

Ha 

150 160 170 180 190 200 210

ACACTACATCCCCTACATTTCCCCATACTCTCCCGTTACCCCCTACAACCAGAACTTTAAACCTAACCCC 
uAapTyr IleCIy Tyrllf SerProTyrSerClyValThrClyTyrAsnClnLyaPhe LyaGIytyiAla 
I Raal Bat£Il Oral 

tX Hpall 

Haelll 

220 230 240 250 260 270 280

ACCC77AC7C7CCACAAA7CT7CC7CAAC7CCTTACA7GCACC7CCC77C7TTCACCTC7CAGCAC7CCC 
7hrL*uThr¥alAapLysS«rSerSer7hrAlaTyrMetGluLtuArgS«rtauThpSf rCluAapSarA 
Aeel HbolX AUl Ddtl KinflFn 

HlncXI BlaXttBbvI Sac 

Sail FnuUHI 
7aql 

290 300 310 320 330 340 350

CCCTA7ACTA77CCGCCCGC7CC7CTCC7AACAAA7CCCCCA7CCA7TAC7CCCC7CATCCCCCC7C7G7 
layalTyrTyrCyaAlaClyStr StrClyAsnLYstrpAlaHet AapTYrTrpClvHlsClyAlaSerVa 
uOII Hhal Banll Haalll Haalll Ahall Ma 

IIAccI FbuoTT NqoI 

HlnPIHlalV HlalX: 

Sau96I 
Styl 

360 370 
7AC7CTA7CC7CA7AGCA7CC 
lThrVilScrScr«aaAap 
till BaaHl 

XhoIX 
10 20 30 40 50 60 70

ClATTCCACCTCCTAATCACCCACACTCCCCTCTCTCTCCCCCTTTCTCTCCCTCACCACCCTTCTATTT 
CluPhcA9pValV«lHitThrClnThrProU«u5trLtuProValStrU«uClyAspCinAlaScrIltS 
EeoRI AatZI Hlnfl Hpall BstEII 

Ahall HphI EcoRII 

Taqi ScrFl 
Acyl Mitlll 
Mtell 

80 90 100 110 120 130 140

CTTCCCCCTCTTCCCACTCTCTCCTCCATTCTAATCCTAACACTTACCTCAACTCCTACCTCCAAAACCC 
erCy3 ArgScrSerClnSTL€uValHl3Str<tanClyA9nThrTyr LeuA3nTrpTyrHuGlnLyaAl 
FnuiHI Avail Hatlll HglEII Ban! 

Mboll BatXI Kpnl 

Sau96Z H1*1V 

Raal 

150 160 170 180 190 200 210 
TCCTCACTCTCCCAACCTTCTCATCTACAAAC7CTCTAACCCCTTCTCTCCTCTCCCGCATCCTTTCTCT 
aClyClnStrProLysLcuLtuIleTyr LyaValSerAanArgPheSer CIyValProAapArgPhaSap 
Alul S»u3A Hpall 
Hlnfllll NcllSau3A 
Scrfl 

220 230 240 250 260 270 280 

CCTTCTCCTTCTCCTACTCACTTCICCCTCAACATCTCTCCTCTCCACCCCCACCATCTCCCTATCTACT 
ClySerClyScrClyThrA9pPheThrL«uLysIl«SarArgV4lCluAlaCluAapL«uClyIltTyrP 
Raal Hpht Belli TaqlHatlZI Sau3A 

Mboll XhoXI 
Sau3A 
Xholl 

290 300 310 320 330 340 350

TCTCCTCTCACACTACTCATCTACCCCCCACCTTCCCCCCTCCCACCAACCTCCACATCAAACCTTCACCATCC 
rnCyaStrCln ThrThrHl3yaXProPro ThrPhaClyGlyClyThrlyaLeuCimictyaArg*op 

Ddel NlaiXI KflEII BanI Alul 5au3A Haell BamHI 

Raal HlalV Aval NlalV 

Taql Sau3A 
Xhol XhoXI 
10 20 30 40 50 60 70

CAATTCCAACTTCAACTCCACCACTCTCCTCOTCAATTCCTTAAACCTCCCCCCTCTCTCCCCATCTCCT 
GIuPheCluVtlGlnLtuClnClnSerGlyProGluLtuVtlLysProGlyAlAStrValArgHetStrC 

AauII Bbvl Avail Ahall Hh«Z 

EeoRX FnuUKI 5au96Z Ban! HinPI 

Ttql PstI EeoRII HstXNlaXII 

Hatll FspX 
Hhal 
HlttPX 
NarX 
KlaXV 
AcyX 

80 90 100 110 120 130 140

CCAAATCC7CTCGG 7ACATTTTCACCAATTACTACATCCATTGCCTTCCCCACTCTCA TCG7AACTCTCT 
C A T C 7 AAAA& 7 eCTTAAlfCAY(StA(:iCTAACCCAAC mTC " 

y3Ly3S«rSepGlyTyrXlePhtThrAsnryrTyrIleHi3rrpVaiArgClnSerHl3GlyLy3SepLe 
RsaX HphI FokX BatXI NlaXXI Xtt 

Ha 

150 160 170 180 190 200 210

AGACTACATCCGCTCGATCTACCCCCGTAATCCTAACACTAAGTACTACAATGAGAACTTT AAACC7AAC 

7GATC7C7CCCACC7ACA7CCGCCCA77ACCA77C7GAT7CA7GA7C77AC7C77GAAA 
uAapTyrXieGiyTrpXleTyrPPoCiyA3nGXyAsn7hrLysTyrTyrAanCluA3nPhetyaGXyLy9 
X Sau3A AvaX Hat XIXDdeXRsal DraX 

•X XhoXI HpalX SeaX 

KclX 
NelX 
Saal 
XoaX 

220 230 240 250 260 270 280 

GCGACCC77AC7G7CCACAAA7C77CC7CAAC7CC77ACA7CGACC7GCC77C777CACC7C7CAGGAC7 
AXa7hrteu7hrValA9pLy3SepSepSerThrAla7yrHetGXuLeuArgSePLeu7hrSepGXuAapS 
AccX HboXX AXuX OdeX Hinf 

HlncXX NXaXXXBbvI 
SalX FnuMHX 
TaqX 

290 300 310 320 330 340 350

CCCCCG7A7AC7A77CCCCCCGC7CC7C7CG7AACAAA7 CCCCC77CCAT7AC7CCCGTCA7CC CCCC7C 

GCAAGC7AATCACCCCAC7ACCCC 
•PAiaVal7yr7yrCyaAIaCXySepS«pCXyA3nLy97ppAIaPneAapTypTppCiyHl3GlyAlaSa 
X AccX HhaXBanXX HaaXIX Hat XIX AhaXI 

FnuOXX FnuDII Sau96 I7aqI BanI 

SacXX HlnPIHlalV HaaXX 

Khal 
HlnPI 

360 370 Wapl 

7C7TAC7C7A7CC7CA7ACCA7CC NX all X 

pValThpValSepSep*am NlaX'V 
HacXIX BaoHi Acyl 

Sau3A Fl&l. 
XhoXI 
10 20 30 40 50 60 70

CUTTCCACCTCCTAATCACCCACACTCCCCTCTCTCTCCCCCTTTCTCTCCCTCACCACCCTTCTITTT 
CIuPhtAspValValMttThrCInThrProLtuS«rLtufroV«lSirLtuGlyAapGlnAia5«rlltS 
EcoRI AAtIZ »lnri HpaXI BttCIZ 

Anail Hphi EeoRZi 

T*ql Scrri 
Aeyl HaallX 
MacII 

80 90 100 110 120 130 140

CTTCCCCCTCTTCCCACTCTATTCTCCACTCTAATCCTAACACTTACCTCCATTCCTAC CTCCAAAACCC 

AACGGCCACAACCGTCACATAACACCTGACATTACCATTCTCAATQCACCTAAC 
ercy3Arg5epSerGin5erii«vaiHl3S«PAsnCiyAsnTnrrypL«uAspTrpTyrLeuCXnLyaAI 
fnuKHI HgiAI Haelll EcoRII 6anl 

H60II Scpfl Kpnl 

HiiEii niTv 

ftsal 

150 160 170 180 190 200 210

TCCTCAGTCTCCCAACCTTCTCATCTACAAACTCTCTAACCCCTTCTCTCCTCTCCCCCATCCTTTCTCT 
aGIyGXnStrProLyaLeuLtuIleTyrLysValSerAanArgPhaSerClyValProAapArgPhtStP 
AXuI Sau3A HpaXX 
HinflXII HclXSau3A 

SerFI 

220 230 240 250 260 270 280

CGTTCTCCTTCTCCTACTCACTTCACCCTCAACATCTCTCCTCTCCAGGCCGACCATCTCCCTATCTACT 

CCCTCCTUACCCAtAaATCA' 

CXySerGIySerGXyThPAapPheThPLeuLyalltStPArgVaXCluAXaCXuAapLeuGXyXXeTypT 
Raal Hphi BgXXI TaqlJUeXII Sau3A 

HftoXX XhoXI 
Sau3A 
XhoII 

290 300 310 320 330 340 350

ACTCCTTCCACCCCTCTCATCTACCC7CCACCTTCCCCCCTCCCACCAAGC TCSAGATCAAACCTTCACCATCC 
TGACCAAGGTCCCCAGACTACATCGCACCTCbAACCCCCCACCCTCCTTCCACCT 
ypCy3Ph«ClnClyStPHl3ValPPoTrpThpPhtClyCXyCiyThPLyat«uClui:«LysAPg«op 

EeoRXI HXaXXZ AvaXX BanI AXuX Sau3A HaaZX BamHX 

ScpFZ Raal Sau96Z HlalV AvaX HlalV 

HglEXI TaqX Sau3A 

Xhol Xholl 
10 20 30 40 50 60 70

CAATTCITCGAACTACAACTCCAACAATCTCCCCCCCCTCTCCTACCTCCCTCTCACACTCTCTCCCTCA 
CluPheNetCluTalClnLtuCXnClnStrGlyfroGlyLtuVilAriProSerClnThrLiuStrLtuT 
EeoHZHIalll XjiX ApaZHpall Rati DdalHlnfZ 

fianll Haat: Tthllll 

Ha«XZI 
Moll 
NlaXV 
Sau96Z 
Sau96Z 
ScrFZ 

80 90 100 110 120 130 140

CTTCTACCCTATCCCCATCCACCTTCTCTAACTACTACATCCATTCGCTCCCTCAACCCCCCCCTCCTCC 
hr Cya Thr ValSerClySe rThrPheSerAan TyrTy rile HI 3 TrpValArtCln Pro Pro GlyArgCl 
Rsal BanHI Fokl Avall HincII Hpall 

NlalV Nelt 
Sau96r SerFI 




150 160 170 180 190 200 210 

TCTCCACTCCATCCCTTGGATTTACCCCCGTAATCCTAACACTAAGTACTACAATGACAACT7TAAAGCC 
YlauCluTrpZXaGIy Trp Zl eTyrProCl yAsnCl yAsnThr LyaTyrTyr AanClu Asn PheLyaGly 
Sau3A Aval Hat I tlOdt IRsa I Dral N 

HpaZZ Seal Sp 

HeiZ 

Nell 
SerFZ 

ScrFZ 
Saal 
XaaZ 

220 230 240 250 260 270 280 

ATCCTCCTCCACACTTCTAACAACCAATTC7CTCTCCCTCTCTCTTCTCTTACCCCCCCTCATACTCCTC 
HetleuValAapThrSerlysAsnGlnPheStrLeuArcteuScrSerValThrAlaAlaAspThrAIaV 
laZXZ AeeZ DdeZXanI Hgal HboII HaellZFnutiHI 

hi HincZZ BbvIZ FnuOZZ 

SalZ SacIZ 
TaqZ 

290 300 310 320 330 340 350

TCTACTACTCCCCCCGTTCCTCCCG7AATAAC7GCCCA7T7CA7iACTCCCCCCAGCCCTCTCTCCTCAC 
alTyrTyrCyiAlaArc StrSerGlyAsnLyaTrpAlaPheAspTyrTrpCly GlnClySerLcuValTh 
RaaZ BssHIZ Hpall HXalV BanIZ BstEZI 

FnuOZZ eeoRZZ HphI 

FauDZZ HaelZI HaaZZI 

Hhal Sau96I 

HhaZ ScrFZ 
HlnPZ 

360 370
CCTATCCTCT7AACTCCAG 
r ValSerS«r*oeituGXn 
PatZ 
10 20 30 40 50 60 70

CAATTCATCCAATC7CT7CTCACTCAGCCCCCS7CTG7ATCTCCTCCACCCCCTCAACCCCTAACTATC7 
GluPhtHetCluS<ryalttu7hrCXnProFroStrVaISerC)yAlaProClyGlnAr|V«17hrIItS 
Eeell Hlnfl OtftIFnu4HI H|lAIHpaII FnuOlI 

ttlalXX Hlafl HeilHineX! Hatlll 

XanI ScrFI Mlul 

80 90 100 110 120 130 140

C77CCCC77CC7C7CAC7C7AT7C7CCA77C7AA7CCCAACAC77A7C7GCAA7CC7ACCAACAAC7GCC 
trCya ArgSerSerGlnSerIleValHl3SerA3nClYA3nThrTyrLeuCIu 7rp7yrClnClnLtuPr 
Ddtl BatXl BanI Hp 

Kpnl He 
NlalV Se 
Raal 

150 160 170 180 190 200 210 

CCCCACCGCGCCCAACC7CC7CA7C777AAAC7A7C7AA7CCC77C7C7CCCG7ACCCGA7CCA77C7C7 
oGlyThrAlaProLysLautiuIlePht Lya VilScrAsnArgPheSerCIy yalProAapArgPhtSfr 
all FnuOII Alul Oral Raal Clal 

it Hhal Bbvl Sau3A Hpall Hlnfl 

rFI HlnPX Fnu4HI Sau3A 

BanI 7aql 

MlatV 

220 230 240 250 260 270 280 

C7A7C7AAG7C7CCC7CC7C7CCCACTC7CCCCA7CACTGG7CTCCAAGCACAACA7CACGCCCA77AC7 
Val5crLyaS«rClySer5crAlaThrLeuAlant7hrGIyLeuGlnAlaCIuAapCluAlaAap7yrT 
Ddtl HlalV Bill Sau3A Mboll Haelll 

290 300 310 320 330 340 350 

AC7C7777CAACCC7C7CA7G7ACCC7GGACC77CCC7CC7CCCACCAAGC77AC7G7AC7CCC7CACCC 
YrCya PheClnClySerHlaValProTrpThrPheCly ClyClyThrLy3lau7hrValLauAfgGlnPr 
NlalXI Avail BanI Alul Raal Hgal 



360 



Raal Sau96I MlalV HlndlTX 

HiiEXI 



G7AAC7GCAG r-i r ilf 

o'ocLtuCln riOi. 1 



PatX 
HatllX 
10 20 30 40 50 60 70

CAACTraUCTCCACCACTCTCCTCCTCJaTTCCTTAAACCTCCCCCCTCTCTCCCCATCTCCTCCAAATCCTCT 
EVQLOOSCPCLVKPGASVRMSCXSS 
Bbvl^ Av»II Ahai: HhtZ HnXU 

Fnu4KI Siu96X BanlMAlI^* HinPI 

PStX ECORZI rspXHiiix: 

HatXX NspHX 
HhaX 
HinPX 
Nsri 
Klaxv 
Scrri 

/'I , I 

85 95 105 115 125 135 145

CCCTACCCCCACTCTCATCCTAACTCTCTACACTTTAAACGTAACCCCACCCTTACTCTCCACAAATCTrcrrCA 
GYRCSKCXSLDFKCKA7L7VDKSSS 
fiml gJLUJ KUIXX Xbat acoI M&oXI- 

K£I1X am Hindi Mnll* 

Irtnv sail 
xaal Taqi 

160 170 180 190 200 210 220

ACTCCTTACATCCACCTCC C: I Cfl I CACCTCTCACCACTCCCCCGTATACTATTCCCCCCCTATCCATTA7TCC 
TAVMELRSLTSEDSAVYYCARIDYW 

AluX Ddtl HinfX Aecl AccXi eiii ni 

NlalllSbvi. MnlltMnlX- AecXX AecII TaqZ S 

rnu4HI NSPBII BSSHII 

SaelX Hhal 
Hhal 
KinPI 

Hin?: 



235 245 255 265 

CCCCATCCCCCTACCCTTACCCTCACCTCCTAAGCATCC f/^^ 

CHCASVTVSS'GS 
IXV HaaXX AluX DdelBamHl 

au96I Hhal BanXXHstXIKUXV 
HatXXX KlnPX Bspl2B6 Slu3A 

Ncol NhtX K9iAZ Xholl 

NlaXXX SacX 
St/2 
10 20 30 40 50 60 70

GAATTCATGGCTGACAACJUUVTTaUVCAAGGAACACCAGAACGCGTTCTACGAGATCTTC 

EFKADNKFNKCQQKAFYEILHLPNL 
ECORI Mlul BgllZ BspMI-l- 

Xmni 

85 95 105 115 125 135 145 

AACGAAGAGCAGCGTAACGCCTTCATCCAAAGCTTGAAAGACGACCCCTCTCAGAGCGCTAACCTGCTGGCAGAG
NEEQRNAFIQSLKDDPSQSANLLAE

HindIII BspMI+

Eco47III

160 170 180 190 200 210 220

GCCAAGAAACTGAACGACGCTCAGGCGCCGAAGAGTGATCCCGAAGTTCAACTGCAGCAGTCTGGTCCTGAATTG
AKKLNDAQAPKSDPEVQLQQSGPEL
NarI PstI

235 245 255 265 275 285 295

GTTAAACCTCCCGCCTCTGTCCGCATGTCCTGCAAATCCTCTCGGTACATTTTCACCGACTTCTACATGAATTGG 
VK .PGASVRMSCKSSGYIFTDFYMNW 

Narl Fspl 

310 320 330 340 350 360 370 

GTTCGCCAGTCTCATGGTAAGTCTCTAGACTACATCCCCTACATTTCCCCATACTCTGGGGTTACCGGCTACAAC 
VRQSHGKSLDYICVISPYSGVTGYN 
BstXI Xbal PflHI BstEII 

385 395 405 415 425 435 445 

CACAAGTTTAAAGCTAAGGCGACCCTTACTCTCGACAAATCTTCCTCAACTGCTTACATGGAGCTGCCTTCm 
QKFKGKATLTVDKSSSTAYMELRSL 
Dral Sail 

460 470 480 490 500 510 520 

ACCTCTGAGGACTCCGCGGTATACTATTGCGCGGGCTCCTCTGGTAACAAATGGGCCATGGATTATTGGGGTCAT
TSEDSAVYYCAGSSGNKWAMDYWGH
SacII NcoI

535 545 555 565 575 585 595 

GGTGCTAGCGTTACTGTGAGCTCTGGTGGCGGTGGGTCGGGCGGTGGTGGCTCGGGTGGCGGCGGATCCGACGTC
GASVTVSSGGGGSGGGGSGGGGSDV
NheI SacI BamHI AatII

610 620 630 640 650 660 670 

GTTGTTACCCAGACTCCGCTGTCTCTGCCGGTTTCTCTGGGTGACCAGGCTTCTATTTCTTGCCGCT
VVTQTPLSLPVSLGDQASISCRSSQ

BstEII PflM

685 695 705 715 725 735 745 

TCTCTGGTCCATTCTAATGGTAACACTTACCTCAACTGGTACCTGCAAAAGGCTGGTCAGTCTCCGAAGCTTCTG
SLVHSNGNTYLNWYLQKAGQSPKLL
I BstXI BspMI+ HindIII

KpnI



FIG. 6A-1
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770 780 790 800 810 820 

ATCTACAAAGTCTCTAACCGCTTCTCTGGTGTCCCGGATCGTTTCTCTGGTTCT
IYKVSNRFSGVPDRFSGSGSGTDFT

845 855 865 875 885 895 

CTGAAGATCTCTCGTGTCGAGGCCGAAGACCTGGGTATCTACTTCTGCT

910 920 930 940 

TTTGGTGGTGGCACCAAGCTCGAGATTAAACGTTAACTGCAG
FGGGTKLEIKR*

XhoI HpaI PstI
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10 20 30 40 50 60 

GATCCTGACGTCGTAATGACCCAGACTCCGCTCTCTCTGCCGGTTTCTCTGGGTGACCAG
DPDVVMTQTPLSLPVSLGDQ
AatII BstEII

70 80 90 100 110 120

GCTTCTATTTCTTGCCGCTCTTCCCAGTCTCTGGTCCATTCTAATGGTAACACTTACCTG
ASISCRSSQSLVHSNGNTYL

PflMI BstXI

130 140 150 160 170 180

AACTGGTACCTGCAAAAGGCTGGTCAGTCTCCGAAGCTTCTGATCTACAAAGTCTCTAAC
NWYLQKAGQSPKLLIYKVSN
BspMI+ HindIII

KpnI

190 200 210 220 230 240

CGCTTCTCTGGTGTCCCGGATCGTTTCTCTGGTTCTGGTTCTGGTACTGACTTCACCCTG
RFSGVPDRFSGSGSGTDFTL

250 260 270 280 290 300

AAGATCTCTCGTGTCGAGGCCGAAGACCTGGGTATCTACTTCTGCTCTCAGACTACTCAT
KISRVEAEDLGIYFCSQTTH
BglII

310 320 330 340 350 360

GTACCGCCGACTTTTGGTGGTGGCACCAAGCTCGAGATTAAACGTGGATCTGGAGGTGGC
VPPTFGGGTKLEIKRGSGGG

XhoI

370 380 390 400 410 420 

GGATCTGGTGGAGGTGGCTCTGGTGGCGGTGGATCCGAAGTTCAATTGCAGCAGTCTGGT
GSGGGGSGGGGSEVQLQQSG

BamHI

430 440 450 460 470 480

CCTGAATTGGTTAAACCTGGCGCCTCTGTGCGCATGTCCTGCAAATCCTCTGGGTACATT
PELVKPGASVRMSCKSSGYI
NarI FspI

490 500 510 520 530 540

TTCACCGACTTCTACATGAATTGGGTTCGCCAGTCTCATGGTAAGTCTCTAGACTACATC
FTDFYMNWVRQSHGKSLDYI

BstXI XbaI

550 560 570 580 590 600

GGGTACATTTCCCCATACTCTGGGGTTACCGGCTACAACCAGAAGTTTAAAGGTAAGGCG
GYISPYSGVTGYNQKFKGKA
PflMI BstEII DraI

610 620 630 640 650 660 

ACCCTTACTGTCGACAAATCTTCCTCAACTGCTTACATGGAGCTGCGTTCTTTGACCTCT
TLTVDKSSSTAYMELRSLTS
SalI

670 680 690 700 710 720

GAGGACTCCGCGGTATACTATTGCGCGGGCTCCTCTGGTAACAAATGGGCCATGGATTAT
EDSAVYYCAGSSGNKWAMDY

SacII NcoI
730 740 750 760

TGGGGTCATGGTGCTAGCGTTACTGTGAGCTCTTAACTGCAG

WGHGASVTVSS*



FIG. 6B
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no,. >B 
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10 20 30 40 50 60 

CAACTTCAACrrCCAGCAGTCTCCTCCTGCATTGGTTCCACCTTCCCACACTCTCTCCCTG 
EVQLEQSGPGLVRPSQTLSL 

70 80 90 100 110 120 

ACCTGCACATCCTCTGGGTACATTTTCACCGACTTCTACATGAATTGGGTTCGCCAGCCT

TCTSSGYIFTDFYMNWVRQP 
BspMI+ BstXI 

130 140 150 160 170 180 

CCTGGTCGGGGTCTAGACTACATCGGGTACATTTCCCCATACTCTGGGGTTACCGGCTAC
PGRGLDYIGYISPYSGVTGY
XbaI PflMI BstEII

190 200 210 220 230 240

AACCAGAAGTTTAAAGGTAAGGCGACCCTTCTGGTCAACAAATCTAAGAACCAGGCTTCC
NQKFKGKATLLVNKSKNQAS
DraI



250 260 270 280 290 300 

CTCCGGCTGTCTTCTGTGACCGCTGCGGACACCGCGGTATACTATTGCGCGGGCTCCTCT
LRLSSVTAADTAVYYCAGSS

SacII

310 320 330 340 350 360

GGTAACAAATGGGCCATGGATTATTGGGGTCAGGGTTCTCTGGTTACTGTGAGCTCTGGT
GNKWAMDYWGQGSLVTVSSG

NcoI SacI

370 380 390 400 410 420

GGCGGTGGGTCGGGCGGTGGTGGCTCGGGTGGCGGCGGATCCGACGTCGTTATGACCCAG
GGGSGGGGSGGGGSDVVMTQ

BamHI AatII

430 440 450 460 470 480 

CCTCCCTCGGTTTCCGGGGCTCCTGGTCAGCGGGTTACTATTTCTTGCCGCTCTTCCCAG
PPSVSGAPGQRVTISCRSSQ 

PflM 

490 500 510 520 530 540 

TCTCTGGTCCATTCTAATGGTAACACTTACCTCAACTGGTACCAGCAACTGCCTGGTACG

SLVHSNGNTYLNWYQQLPGT
I BstXI KpnI

550 560 570 580 590 600

GCTCCGAAGCTTCTGATCTACAAAGTCTCTAACCGCTTCTCTGGTGTCCCGGATCGTTTC
APKLLIYKVSNRFSGVPDRF
HindIII



610 620 630 640 650 660 

TCTGGTTCTGGTTCTGGTACTGACTTCACCCTCGCGATCACTGGTCTCCAGGCCGAAGAC
SGSGSGTDFTLAITGLQAED

670 680 690 700 710 720

GAGGCTGACTACTTCTGCTCTCAGACTACTCATGTACCGCCGACTTTTGGTGGTGGCACC
EADYFCSQTTHVPPTFGGGT

730 740 750

AAGCTCACGGTTCTGCGTTAACTGCAG

KLTVLR*LQ

HpaI PstI
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10 20 30 40 50 60 

GAATTCGAAGTTCAACTGCAGCAGTCTGGTCCTGAATTGGTTAAACCTGGCGCCTCTGTG
EFEVQLQQSGPELVKPGASV

AsuII PstI NarI Fs

EcoRI

70 80 90 100 110 120

CGCATGTCCTGCAAATCCTCTGGCTACACCTTCACCAACTATTACATCCACTGGCTTAAG
RMSCKSSGYTFTNYYIHWLK

AflII

130 140 150 160 170 180 

CAGTCTCATGGTAAGTCTCTAGACTGGATCGGTTGGATTTACCCGGGTAATGGTAACACT
QSHGKSLDWIGWIYPGNGNT
XbaI SmaI

190 200 210 220 230 240

AAGTACAATGAGAACTTTAAAGGTAAGGCCACCCTTACTGTCGACAAATCTTCCTCAACT
KYNENFKGKATLTVDKSSST
DraI SalI

250 260 270 280 290 300 

GCTTACATGGAGCTCCGTTCTTTGACCTCTGAGGACTCCGCGGTATACTATTGCGCGCGT
AYMELRSLTSEDSAVYYCAR

SacII BssHII

310 320 330 340 350 360

TACACTCATTATTACTTCGATTATTGGGGCCATGGCGCTAGCGTTACCGTGAGCTCTGGT
YTHYYFDYWGHGASVTVSSG

NcoI NheI SacI

370 380 390 400 410 420

GGCGGTGGCTCGGGCGGTGGTGGGTCGGGTGGCGGCGGATCCGACGTCGTTATGACCCAG
GGGSGGGGSGGGGSDVVMTQ

BamHI AatII

430 440 450 460 470 480 

ACTCCCCTCTCTCTGCCGGTTTCTCTCGGTGACCAGGCTTCTATTTCTTGCCGCTCTTCC
TPLSLPVSLGDQASISCRSS

BstEII

490 500 510 520 530 540

CAGTCTATCGTCCATTCTAATGGTAACACTTACCTGGAGTGGTACCTGCAAAAGGCTGGT
QSIVHSNGNTYLEWYLQKAG
BstXI BspMI+

KpnI

550 560 570 580 590 600 

CAGTCTCCCAAGCTTCTCATCTACAAAGTCTCTAACCGCTTCTCTGGTGTCCCCGATCGT

QSPKLLIYKVSNRFSGVPDR
HindIII

610 620 630 640 650 660

TTCTCTGGTTCTGGTTCTGGTACTGACTTCACCCTGAAGATCTCTCGTGTCGAGGCCGAG
FSGSGSGTDFTLKISRVEAE

BglII

670 680 690 700 710 720

GATCTCGGTATCTACTACTGCTTCCAAGGGTCTCATGTACCCTGGACTTTCGGCGGTGGG
DLGIYYCFQGSHVPWTFGGG

730 740 750 

ACCAAGCTCGAGATTAAACGTTAACTGCAG

TKLEIKR*LQ

XhoI HpaI PstI
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10 20 30 40 50 60 

GATCCCGAGGTTATGCTGGTTGAATCTGGTGGAGTACTGATGGAACCTGGTGGGTCCCTG
DPEVMLVESGGVLMEPGGSL

ScaI EcoO

70 80 90 100 110 120

AAGCTGAGCTGTGCTGCTAGCGGCTTCACGTTCTCTCGTTACGCCATGTCTTGGGTCCGT
KLSCAASGFTFSRYAMSWVR
EspI NheI PflMI

130 140 150 160 170 180

CAGACTCCGGAGAAGCGTCTAGAGTGGGTCGCGACGATATCTTCTGGTGGTTCTCACACG
QTPEKRLEWVATISSGGSHT
BspMII XbaI NruI EcoRV

190 200 210 220 230 240 

TTCCATCCAGACAGTGTGAAGGGTCGATTCACCATCTCTCGAGACAACGCTAAGAACACG
FHPDSVKGRFTISRDNAKNT

XhoI

250 260 270 280 290 300

TTGTACCTGCAAATGTCTTCTCTACGTAGTGAAGATACTGCTATGTACTACTGTGCACGT
LYLQMSSLRSEDTAMYYCAR
BspMI+ SnaBI ApaLI

310 320 330 340 350 360

CCTCCACTGATCTCACTAGTTGCTGATTATGCCATGGATTATTGGGGTCATGGTGCTAGC
PPLISLVADYAMDYWGHGAS
SpeI NcoI NheI

370 380 390 400 410 420

GTTACTGTGAGCTCTGGTGGCGGTGGGTCGGGCGGTGGTGGCTCGGGTGGCGGCGGATCG
VTVSSGGGGSGGGGSGGGGS
SacI

430 440 450 460 470 480 

GATATCGTTATGACTCAGTCTCATAAGTTCATGTCCACTTCTGTTGGTGACCGTGTTTCT

DIVMTQSHKFMSTSVGDRVS
EcoRV BstEII

490 500 510 520 530 540

ATCACTTGTAAGGCCAGCCAGGATGTGGGTGCTGCTATCGCATGGTATCAGCAGAAGCCC
ITCKASQDVGAAIAWYQQKP
PflMI Sma

550 560 570 580 590 600

GGGCAGTCTCCTAAGCTGCTGATCTACTGGGCGTCGACTCGTCATACTGGTGTCCCGGAT

GQSPKLLIYWASTRHTGVPD
I SalI



610 620 630 640 650 660 

CGTTTCACTGGGTCCGGATCAGGTACTGATTTCACTCTGACTATTTCGAACGTTCAGTCT
RFTGSGSGTDFTLTISNVQS
BspMII AsuII

670 680 690 700 710 720

GATGACCTGGCTGATTACTTCTGCCAGCAATATTCCGGGTACCCTCTGACTTTCGGTGCC
DDLADYFCQQYSGYPLTFGA

SspI KpnI Nae

730 740 750

GGCACTAAACTCGAGCTGAAGTAACTGCAG

GTKLELK*
I XhoI PstI



RGn. 
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10 20 30 40 50 60 

GATCCCGAGGTTATGCTGGTTGAATCTGGTGGAGTACTGATGGAACCTGGTGGGTCCCTG
DPEVMLVESGGVLMEPGGSL

ScaI EcoO

70 80 90 100 110 120

AAGCTGAGCTGTGCTGCTAGCGGCTTCACGTTCTCTCGTTACGCCATGTCTTGGGTCCGT
KLSCAASGFTFSRYAMSWVR
EspI NheI PflMI

130 140 150 160 170 180

CAGACTCCGGAGAAGCGTCTAGAGTGGGTCGCGACGATATCTTCTGGTGGTTCCAACACT
QTPEKRLEWVATISSGGSNT
BspMII XbaI NruI EcoRV AstxII

190 200 210 220 230 240 

TACTATCCAGACAGTGTGAAGGGTCGATTCACGATCTCTCGAGACAACGCTAAGAACACG 
YYPDSVKGRFTISRDNAKNT

XhoI



250 260 270 280 290 300

TTGTACCTGCAAATGTCTTCTCTACGTAGTGAAGATACTGCTATGTACTACTGTGCACGT
LYLQMSSLRSEDTAMYYCAR
BspMI+ SnaBI ApaLI

310 320 330 340 350 360

CCTCCACTGATCTCACTAGTTGCTGATTATGCCATGGATTATTGGGGTCATGGTGCTAGC
PPLISLVADYAMDYWGHGAS
SpeI NcoI NheI

370 380 390 400 410 420

GTTACTGTGAGCTCTGGTGGCGGTGGGTCGGGCGGTGGTGGCTCGGGTGGCGGCGGATCG
VTVSSGGGGSGGGGSGGGGS
SacI



430 440 450 460 470 480 

GATATCGTTATGACTCAGTCTCATAAGTTCATGTCCACTTCTGTTGGTGACCGTGTTTCT 

DIVMTQSHKFMSTSVGDRVS 
EcoRV BstEII 

490 500 510 520 530 540 

ATCACTTGTAAGGCCAGCCAGGATGTGGGTGCTGCTATCGCATGGTATCAGCAGAAGCCC 
ITCKASQDVGAAIAWYQQKP 
PflMI Sma 

550 560 570 580 590 600 

GGGCAGTCTCCTAAGCTGCTGATCTACTGGGCGTCGACTCGTCATACTGGTGTCCCGGAT 

GQSPKLLIYWASTRHTGVPD 
I SalI



610 620 630 640 650 660

CGTTTCACTGGGTCCGGATCAGGTACTGATTTCACTCTGACTATTTCGAACGTTCAGTCT
RFTGSGSGTDFTLTISNVQS
BspMII AsuII

670 680 690 700 710 720

GATGACCTGGCTGATTACTTCTGCCAGCAATATTCCGGGTACCCTCTGACTTTCGGTGCC
DDLADYFCQQYSGYPLTFGA

SspI KpnI Nae



730 740 750

GGCACTAAACTCGAGCTGAAGTAACTGCAG

GTKLELK*
I XhoI PstI






EP 0 318 554 B1 



10 20 
Met Lys Ala Ile Phe Val Leu Lys Gly Ser Leu Asp Arg Asp Leu Asp Ser Arg Leu Asp
ATG AAA GCA ATT TTC GTA CTG AAA GGT TCA CTG GAC AGA GAT CTG GAC TCT CGT CTG GAT

BglII

30 40
Leu Asp Val Arg Thr Asp His Lys Asp Leu Ser Asp His Leu Val Leu Val Asp Leu Ala
CTG GAC GTT CGT ACC GAC CAC AAA GAC CTG TCT GAT CAC CTG GTT CTG GTC GAC CTG GCT

BclI SalI

50 60

Arg Asn Asp Leu Ala Arg Ile Val Thr Pro Gly Ser Arg Tyr Val Ala Asp Leu Glu Phe
CGT AAC GAC CTG GCT CGT ATC GTT ACT CCC GGG TCT CGT TAC GTT GCC GAT CTG GAA TTC

SmaI EcoRI

Asp FIG. 10A

GAT 



EcoRI 




FIG. 10B



Afim 
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94- 
[Gel figure residue: marker scale continues 67, 43, 29, 20.1, 14.4; lanes numbered 0 to 5]



DVQLQESGPGLVKPSQSLSLTCSVTGYSIT 
SGYPWHWIRQFPGNKLEWLGFIKYDGSNYG
NPSLKNRVSITRDTSENQFFLKLDSVTTAT
YYCAGDNDHLYFDYWGQGTTLTVS

GGGGSGGGGSGGGGS

QAVVTQESALTTSPGGTVILTCRSSTGAVT
TSNYANWIQEKPDHLFTGLIGGTSNRAPGV 
PVRFSGSLIGDKAALTITGAQTEDDAMYFC 
ALWFRNHFVFGGGTKVTVLG 



FIG. 9C 
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I 
[Figure: two binding plots. Horizontal axes: log unbound digoxin concentration [M], with ticks from -10 to -4 in the first panel and from -11 to -6 in the second.]
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10 20 30 40 50 60 

GAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG

EFMADNKFNKEQQNAFYEIL
EcoRI MluI BglII

XmnI

70 80 90 100 110 120 

CACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAAAGCTTGAAGGATGAG 
HLPNLNEEQRNGFIQSLKDE 
BspMI+ HindIII

130 140 150 160 170 180

CCCTCTCAGTCTGCCAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGCACCG
PSQSANLLADAKKLNDAQAP
NheI FspI

190 200 210 220 230 240 

AAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCG
KSDQGQFMADNKFNKEQQNA

MluI
XmnI

250 260 270 280 290 300 

TTCTACGAGATCTTGCACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAA 
FYEILHLPNLNEEQRNGFIQ 
BglII BspMI+ H

310 320 330 340 350 360 

AGCTTGAAGGATGAGCCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAAC 

SLKDEPSQSANLLADAKKLN 
indIII NheI

370 380

GATGCGCAGGCACCGAAATCGGATCC

DAQAPKSDP
FspI BamHI
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(BABS)- 

10 20 30 40 50 60 70 

GGATCCGGTAACTCTGACTCTGAATGCCCGCTGAGCCACGACGGGTACTGCCTGCACGACGGTGTTTGCATGTAC

GSGNSDSECPLSHDGYCLHDGVCMY
BamHI BsmI+ EspI

85 95 105 115 125 135 145

ATCGAAGCTCTCGACAAATACGCATGCAACTGCGTTGTAGGCTACATCGGTGAGCGCTGCCAGTATCGCGATCTG
IEALDKYACNCVVGYIGERCQYRDL
SphI NruI

160 170

AAATGGTGGGAGCTGCGTTAACTGCAG

KWWELR*

HpaI PstI

FIG. 15A
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10 20 30 40 50 60 

GGATCCGGTGGCGACCCGTCCAAGGACTCCAAAGCTCAGGTTTCTGCTGCCGAAGCTGGT

GSGGDPSKDSKAQVSAAEAG
BamHI

70 80 90 100 110 120

ATCACTGGCACCTGGTATAACCAACTGGGGTCGACTTTCATTGTGACCGCTGGTGCGGAC
ITGTWYNQLGSTFIVTAGAD

SalI

130 140 150 160 170 180

GGAGCTCTGACTGGCACCTACGAATCTGCGGTTGGTAACGCAGAATCCCGCTACGTACTG
GALTGTYESAVGNAESRYVL
SacI SnaBI

190 200 210 220 230 240 

ACTGGCCGTTATGACTCTGCACCTGCCACCGATGGCTCTGGTACCGCTCTGGGCTGGACT
TGRYDSAPATDGSGTALGWT
BspMI+ KpnI

250 260 270 280 290 300

GTGGCTTGGAAAAACAACTATCGTAATGCGCACAGCGCCACTACGTGGTCTGGCCAATAC
VAWKNNYRNAHSATTWSGQY

FspI DraIII BalI

PflMI BstXI

310 320 330 340 350 360

GTTGGCGGTGCTGAGGCTCGTATCAACACTCAGTGGCTGTTAACATCCGGCACTACCGAA
VGGAEARINTQWLLTSGTTE

DraIII HpaI

370 380 390 400 410 420 

GCGAATGCATGGAAATCGACACTAGTAGGTCATGACACCTTTACCAAAGTTAAGCCTTCT 

ANAWKSTLVGHDTFTKVKPS
BsmI+ SpeI

NsiI

430 440 450 460 470 480

GCTGCTAGCATTGATGCTGCCAAGAAAGCAGGCGTAAACAACGGTAACCCTCTAGACGCT
AASIDAAKKAGVNNGNPLDA
NheI BstEII XbaI

490 500
GTTCAGCAATAACTGCAG

VQQ*
PstI

FIG. 15B
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(BABS)- 

10 20 30 40 50 60 

GGATCCGGTGTACGTAGCTCCTCTCGCACTCCGTCCGATAAGCCGGTTGCTCATGTAGTT 
GSGVRSSSRTPSDKPVAHVV 

BamHI SnaBI

70 80 90 100 110 120 

GCTAACCCTCAGGCAGAAGGTCAGCTTCAGTGGCTGAACCGTCGCGCTAACGCCCTGCTG 
ANPQAEGQLQWLNRRANALL 
MstII BglI

130 140 150 160 170 180

GCAAACGGCGTTGAGCTCCGTGATAACCAGCTCGTGGTACCTTCTGAAGGTCTGTACCTG
ANGVELRDNQLVVPSEGLYL
SacI PflMI KpnI

190 200 210 220 230 240 

ATCTATTCTCAAGTACTGTTCAAGGGTCAGGGCTGCCCGTCGACTCATGTTCTGCTGACT 
IYSQVLFKGQGCPSTHVLLT
ScaI SalI

250 260 270 280 290 300

CACACCATCAGCCGTATTGCTGTATCTTACCAGACCAAAGTTAACCTGCTGAGCGCTATC
HTISRIAVSYQTKVNLLSAI

HpaI BspMI+ Eco47III
EspI

310 320 330 340 350 360 

AAGTCTCCGTGCCAGCGTGAAACTCCCGAGGGTGCAGAAGCGAAACCATGGTATGAACCG 
KSPCQRETPEGAEAKPWYEP 

NcoI

370 380 390 400 410 420

ATCTACCTGGGTGGCGTATTTCAACTGGAGAAAGGTGACCGTCTGTCCGCAGAAATCAAC
IYLGGVFQLEKGDRLSAEIN

BstEII 

430 440 450 460 470 480 

CGTCCTGACTATCTAGATTTCGCTGAATCTGGCCAGGTGTACTTCGGTATTATCGCACTG 
RPDYLDFAESGQVYFGIIAL 
XbaI BalI



490

TAACTGCAG

*

PstI

FIG. 15C
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(BABS) - 



10 20 30 40 50 60 

GGATCCGGTGCTGATCAGCTGACTGACGAGCAGATCGCTGAATTTAAAGAGGCTTTCTCT 

GSGADQLTDEQIAEFKEAFS 
BamHI BclI PvuII DraI



70 80 90 100 110 120 

CTGTTTGACAAAGACGGTGACGGTACCATCACTACCAAAGAGCTCGGCACCGTTATGCGC 
LFDKDGDGTITTKELGTVMR 

KpnI SacI FspI

130 140 150 160 170 180 

AGCCTTGGCCAGAACCCGACTGAAGCTGAATTGCAGGACATGATCAACGAAGTCGACGCT 
SLGQNPTEAELQDMINEVDA 
BalI BclI SalI



190 200 210 220 230 240 

GACGGTAACGGCACCATCGATTTTCCGGAATTTCTGAACCTGATGGCGCGCAAGATGAAA 
DGNGTIDFPEFLNLMARKMK 
ClaI BspMII BssHII

250 260 270 280 290 300 

GACACTGACTCTGAAGAGGAACTGAAAGAGGCCTTCCGTGTTTTCGACAAAGACGGTAAC 
DTDSEEELKEAFRVFDKDGN 

StuI



310 320 330 340 350 360 

GGTTTCATCTCGGCCGCTGAACTGCGTCACGTTATGACTAACCTGGGTGAAAAGCTTACT 
GFISAAELRHVMTNLGEKLT 
EagI HindIII

370 380 390 400 410 420 

GACGAAGAAGTTGACGAAATGATTCGCGAAGCTGACGTCGATGGTGACGGCCAGGTTAAC 
DEEVDEMIREADVDGDGQVN 
XmnI NruI AatII HpaI

430 440 450 

TACGAAGAGTTCGTTCAGGTTATGATGGCTAAGTAACTGCAG
YEEFVQVMMAK*

PstI

FIG. 15D
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(BABS)- 

10 20 30 40 50 60 

GGATCCGGTGGAGGCTCTCTGGGCTCTCTGACTATTGCCGAACCGGCAATGATTGCTGAA

GSGGGSLGSLTIAEPAMIAE

BglI Bsm

70 80 90 100 110 120 

TGCAAGACTCGTACCGAAGTCTTCGAGATCTCTCGTCGTCTGATCGATCGCACTAATGCC 
CKTRTEVFEISRRLIDRTNA

BglII ClaI B8

PvuI

130 140 150 160 170 180

AACTTCCTGGTATGGCCGCCGTGCGTCGAGGTACAACGCTGCTCCGGGTGTTGCAACAAT
NFLVWPPCVEVQRCSGCCNN

190 200 210 220 230 240

CGTAACGTTCAATGTCGACCGACTCAAGTCCAGCTGCGTCCGGTCCAAGTCCGCAAAATC
RNVQCRPTQVQLRPVQVRKI

SalI PvuII

250 260 270 280 290 300

GAGATTGTACGTAAGAAACCGATCTTTAAGAAGGCCACTGTTACTCTGGAAGACCATCTG 

EIVRKKPIFKKATVTLEDHL 
SnaBI 

310 320 330 340 350 

GCATGCAAATGTGAGACTGTAGCGGCCGCACGTCCAGTTACTTAACTGCAG

ACKCETVAAARPVT*
SphI EagI PstI

NotI

FIG. 15E
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(BABS)- 

10 20 30 40 50 60 

GGATCCGGTATATTCCCCAAACAATACCCAATTATAAACTTTACCACAGCGGGTGCCACT 

GSGIFPKQYPIINFTTAGAT

BamHI

70 80 90 100 110 120

GTGCAAAGCTACACAAACTTTATCAGAGCTGTTCGCGGTCGTTTAACAACTGGAGCTGAT
VQSYTNFIRAVRGRLTTGAD 

130 140 150 160 170 180 

GTGAGACATGAAATACCAGTGTTGCCAAACAGAGTTGGTTTGCCTATAAACCAACGGTTT 
VRHEIPVLPNRVGLPINQRF

190 200 210 220 230 240

ATTTTAGTTGAACTCTCAAATCATGCAGAGCTTTCTGTTACATTAGCGCTGGATGTCACC 
ILVELSNHAELSVTLALDVT 

Eco47III

250 260 270 280 290 300 

AATGCATATGTGGTCGGCTACCGTGCTGGAAATAGCGCATATTTCTTTCATCCTGACAAT 

NAYVVGYRAGNSAYFFHPDN
NdeI

NsiI

310 320 330 340 350 360

CAGGAAGATGCAGAAGCAATCACTCATCTTTTCACTGATGTTCAAAATCGATATACATTC 
QEDAEAITHLFTDVQNRYTF 

ClaI

370 380 390 400 410 420 

GCCTTTGGTGGTAATTATGATAGACTTGAACAACTTGCTGGTAATCTGAGAGAAAATATC 
AFGGNYDRLEQLAGNLRENI 

430 440 450 460 470 480

GAGTTGGGAAATGGTCCACTAGAGGAGGCTATCTCAGCGCTTTATTATTACAGTACTGGT
ELGNGPLEEAISALYYYSTG

Eco47III ScaI

490 500 510 520 530 540

GGCACTCAGCTTCCAACTCTGGCTCGTTCCTTTATAATTTGCATCCAAATGATTTCAGAA 
GTQLPTLARSFIICIQMISE 

550 560 570 580 590 600 

GCAGCAAGATTCCAATATATTGAGGGAGAAATGCGCACGAGAATTAGGTACAACCGGAGA 
AARFQYIEGEMRTRIRYNRR 

FspI Bgl
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(BABS)- 

10 20 30 40 50 60 

GGATCCGGTGCTCCGACTTCTAGCTCTACTAAGAAAACTCAGCTTCAGCTGGAACACCTG

GSGAPTSSSTKKTQLQLEHL
BamHI PvuII 

70 80 90 100 110 120 

CTGCTGGACCTTCAGATGATCCTGAACGGTATCAACAACTACAAGAACCCGAAACTGACT 
LLDLQMILNGINNYKNPKLT 

130 140 150 160 170 180 

CGTATGCTGACTTTCAAATTCTACATGCCGAAGAAAGCTACCGAACTGAAACACCTTCAG
RMLTFKFYMPKKATELKHLQ 

190 200 210 220 230 240 

TGCCTGGAAGAAGAACTGAAGCCGCTGGAGGAAGTACTGAACCTGGCTCAGTCTAAAAAC 
CLEEELKPLEEVLNLAQSKN 

ScaI

250 260 270 280 290 300 

TTCCACCTGCGTCCGCGTGACCTGATCAGCAACATCAACGTAATCGTTCTAGAACTTAAA 
FHLRPRDLISNINVIVLELK 

BclI XbaI

310 320 330 340 350 360 

GGCTCTGAAACTACCTTCATGTGCGAATACGCTGACGAAACTGCTACCATCGTAGAATTT 
GSETTFMCEYADETATIVEF 

370 380 390 400 410 420 

CTGAACCGTTGGATCACCTTCTGCCAGTCTATCATCTCTACTCTGACTTAACTGCAG 
LNRWITFCQSIISTLT* 

PstI 

FIG. 15G
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(BABS)- 

10 20 30 40 50 60 

GGATCCGGTGCTGACAACAAATTCAACAAGGAACAGCAGAACGCGTTCTACGAGATCTTG

GSGADNKFNKEQQNAFYEIL
BamHI MluI BglII

XmnI

70 80 90 100 110 120 

CACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAAAGCTTGAAGGATGAG 
HLPNLNEEQRNGFIQSLKDE 
BspMI+ HindIII

130 140 150 160 170 180 

CCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAACGATGCGCAGGCACCG 
PSQSANLLADAKKLNDAQAP 

NheI FspI

190 200 210 220 230 240 

AAATCGGATCAGGGGCAATTCATGGCTGACAACAAATTCAACAAGGAACAGCAGAACGCG 
KSDQGQFMADNKFNKEQQNA 

MluI
XmnI

250 260 270 280 290 300 

TTCTACGAGATCTTGCACCTGCCGAACCTGAACGAAGAGCAGCGTAACGGCTTCATCCAA 
FYEILHLPNLNEEQRNGFIQ 
BglII BspMI+ H

310 320 330 340 350 360 

AGCTTGAAGGATGAGCCCTCTCAGTCTGCGAATCTGCTAGCGGATGCCAAGAAACTGAAC 
SLKDEPSQSANLLADAKKLN 
indIII NheI

370 380

GATGCGCAGGCACCGAAATAACTGCAG

DAQAPK*
FspI PstI

FIG. 15H
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